Overview

ZEAMAP is a database for pan-Zea genomics and maize genetic improvement. It integrates multidimensional maize data, including genome assemblies, transcriptomes, pan-genomes, genetic variations, genetic mappings and evolutionary selection sites.


1 General



The top navigation menu gathers the general functions of the database, including links to the different resources and tools and entry points for users to register or log in.

The site-wide search tool can be accessed on the homepage and from the title bar of each page, so that users can quickly search for features they have just browsed.

Wildcard search with *. Examples:

  1. genom* sequence (matches: genome, genomic, genomics ...)
  2. Lir*dron tulipifera (matches: Liriodendron tulipifera)
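To get a feel for what a * pattern will match, the behaviour can be imitated locally with shell-style globbing. This is only an illustration built on Python's standard `fnmatch` module; it does not call the site's search engine, and the word list is made up for the example.

```python
from fnmatch import fnmatchcase

def wildcard_matches(pattern, words):
    """Return the words that match a *-style wildcard pattern."""
    return [w for w in words if fnmatchcase(w, pattern)]

# Made-up vocabulary, used only to demonstrate the matching rule.
words = ["genome", "genomic", "genomics", "gene", "Liriodendron"]

print(wildcard_matches("genom*", words))
print(wildcard_matches("Lir*dron", words))
```

The first call keeps the three words starting with "genom"; the second keeps only "Liriodendron", because * may stand for any run of characters in the middle of a word.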

Fuzzy search: when you are unsure how to spell a keyword, fuzzy search lets you search for similar words. Append the ~ character to the keyword (keyword~). Examples:

  1. sequeeence~ (matches: sequence)
  2. Alnus rhmifolia~ (matches: Alnus rhombifolia)
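Fuzzy matching of this kind is usually based on edit distance, the number of single-character changes needed to turn one word into another. The sketch below computes the Levenshtein distance for the two misspellings above. The threshold of 2 edits in `fuzzy_match` is an assumption made for illustration, not a documented setting of this site.

```python
def edit_distance(a, b):
    """Levenshtein distance: fewest insertions, deletions or substitutions."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or keep
        prev = curr
    return prev[-1]

def fuzzy_match(keyword, candidate, max_edits=2):
    """True if candidate is within max_edits of keyword (assumed threshold)."""
    return edit_distance(keyword, candidate) <= max_edits

print(edit_distance("sequeeence", "sequence"))
print(edit_distance("rhmifolia", "rhombifolia"))
```

Both misspellings are two edits away from the intended word, which is why a small tolerance is enough to recover them.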

Regular expression search: wrap the pattern in forward slashes (/). Examples:

  1. /transcriptom[a-z]+/ (matches: transcriptome, transcriptomes, transcriptomics ...)
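A pattern can be tried out locally before it is used in the search box. The snippet below uses Python's `re` module, whose syntax may differ in details from the regular expression dialect of the site's search engine; the word list is invented for the example.

```python
import re

# Same pattern as in the example above, without the surrounding slashes.
pattern = re.compile(r"transcriptom[a-z]+")

words = ["transcriptome", "transcriptomes", "transcriptomics",
         "transcriptom", "genome"]

# fullmatch requires the whole word to fit the pattern.
matched = [w for w in words if pattern.fullmatch(w)]
print(matched)
```

"transcriptom" is rejected because `[a-z]+` requires at least one more letter after the stem.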

Boolean operators: + and -. A term prefixed with + must be present; a term prefixed with - must not be present. Examples:

  1. +"green ash" +transcriptome -genome (excludes the word genome)
  2. +"green ash" -transcriptome +genome (includes the word genome)
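When many required and excluded terms are involved, it can be easier to assemble the query string programmatically and paste it into the search box. The helper below is a hypothetical convenience function (`build_query` is not part of ZEAMAP); it only formats text using the + and - conventions described above.

```python
def quote(term):
    """Wrap multi-word terms in double quotes so they are kept as a phrase."""
    return f'"{term}"' if " " in term else term

def build_query(must=(), must_not=()):
    """Compose a query string: + for required terms, - for excluded terms."""
    parts = [f"+{quote(t)}" for t in must]
    parts += [f"-{quote(t)}" for t in must_not]
    return " ".join(parts)

print(build_query(must=["green ash", "transcriptome"], must_not=["genome"]))
# +"green ash" +transcriptome -genome
```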

The AND, OR and NOT operators can be combined in one query. Examples:

  1. "heat stress" AND ("Castanea mollissima" OR "green ash") NOT "heat shock"

The search results are grouped into categories on the right for efficient data filtering:

The search results on the left provide basic information and links to the detailed feature pages.

2 Genomics

This resource gathers maize and teosinte genomic datasets, including reference genome assemblies, annotations, gene expression and chromatin interactions. It also provides tools for browsing features, searching sequences and visualizing datasets.

2-1 Species

This page contains basic information about the current species/germplasm and related external links, such as genome assembly datasets and taxonomy information.

2-2 Genome features

ZEAMAP provides two tools for searching annotated genome features: Search genes and Search features.

The Search genes tool filters genes/mRNAs/proteins based on their IDs or functional annotations.

The Search features tool filters all annotated features based on their feature IDs or their locations.

2-2-3 Feature details

TODO: [Screen shots for gene detail page]

2-3 Genome browser

ZEAMAP provides two genome browsers for visualizing genomic features: JBrowse and the WashU Epigenome Browser.

2-3-1 JBrowse

ZEAMAP provides a JBrowse instance for visualizing genomic data. Many online tutorials explain how to use JBrowse, such as the JBrowse Documentation, and a JBrowse tutorial video gives more detail on how to navigate and use it.

2-3-2 WashU browser

ZEAMAP also provides a WashU Epigenome Browser (version: v48-4-4) instance to better visualize the chromatin interaction datasets. To learn more about how to navigate and use the WashU Epigenome Browser, please visit its official documentation:

WashU Epigenome Browser Documentation

2-4 3D interaction

2-4-1 Browse by table

This section provides a collection of long-range chromatin interaction information for the maize genome. You can search for chromatin interaction records by their IDs, the antibody type, the interaction regions and the number of supporting reads.



Clicking on the “Search” button generates the result table:

The fields are described below:

  1. ID: the chromatin interaction ID. Naming convention: [Genome][Antibody][UniqueID]
  2. Species: the reference genome assembly used.
  3. Type: the antibody type of the dataset.
  4. Distance: the distance (Kb) between the two interacting regions. Interchromosomal interactions are marked as “InterChr”.
  5. Region1: the 1st region of the chromatin interaction (Format: [Chromosome]:[start]-[end]). Left-click to open the region in the WashU browser.
  6. Region2: the 2nd region of the chromatin interaction (Format: [Chromosome]:[start]-[end]). Left-click to open the region in the WashU browser.
  7. SupportReads: the number of reads supporting this chromatin interaction.
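If the result table is processed outside the browser, the Region1/Region2 fields can be parsed with a short script. The sketch below reads the [Chromosome]:[start]-[end] format and derives a distance value. How ZEAMAP itself measures the distance is not stated here, so the midpoint-to-midpoint definition used in `interaction_distance` is an assumption, and the coordinates are invented.

```python
import re

REGION_RE = re.compile(r"^(?P<chrom>[^:]+):(?P<start>\d+)-(?P<end>\d+)$")

def parse_region(text):
    """Split '[Chromosome]:[start]-[end]' into (chromosome, start, end)."""
    m = REGION_RE.match(text.strip())
    if m is None:
        raise ValueError(f"not a [Chromosome]:[start]-[end] region: {text!r}")
    return m["chrom"], int(m["start"]), int(m["end"])

def interaction_distance(region1, region2):
    """Distance in Kb between two regions, or 'InterChr' across chromosomes.

    Assumption: distance is measured between region midpoints.
    """
    chrom1, start1, end1 = parse_region(region1)
    chrom2, start2, end2 = parse_region(region2)
    if chrom1 != chrom2:
        return "InterChr"
    mid1 = (start1 + end1) / 2
    mid2 = (start2 + end2) / 2
    return abs(mid2 - mid1) / 1000

# Invented coordinates, for illustration only.
print(interaction_distance("chr1:1000-2000", "chr1:51000-52000"))
print(interaction_distance("chr1:1000-2000", "chr2:1000-2000"))
```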

2-4-2 Browse in genome browser

In addition to the table browser, the chromatin interaction information can also be viewed through the WashU browser.

2-5 BLAST

Users can compare their query sequences of proteins/nucleotides with the genome/annotated features in the database using BLAST (basic local alignment search tool).

Learn more about BLAST

NCBI BLAST Home page

Below is the interface for running BLAST. A BLAST search takes only four steps:

Step 1. Upload the query sequences

To perform a sequence search, paste your sequences into the query region or drag a sequence file onto it. The sequence type (protein/nucleotide) is detected automatically.

Note: Both raw sequences and multi-FASTA format are supported when pasting from the clipboard, but sequences uploaded from a file must be in FASTA format. Learn more about FASTA format here
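Before submitting a large query, it can help to check locally that the input parses as expected. The sketch below reads raw or multi-FASTA text and guesses the sequence type from the share of nucleotide letters. The 90% threshold and the function names are assumptions for illustration; they are not the detection rule used by the BLAST interface.

```python
def parse_fasta(text):
    """Parse raw or (multi-)FASTA text into a dict of {id: sequence}."""
    records = {}
    current = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            header = line[1:].split()
            # Use the first word of the header as the ID.
            current = header[0] if header else f"seq{len(records) + 1}"
            records[current] = ""
        else:
            if current is None:
                # Raw sequence without a header line.
                current = "query"
                records[current] = ""
            records[current] += line
    return records

def guess_type(seq, threshold=0.9):
    """Guess 'nucleotide' or 'protein' from the fraction of ACGTUN letters."""
    if not seq:
        raise ValueError("empty sequence")
    seq = seq.upper()
    nucleotide_like = sum(seq.count(c) for c in "ACGTUN")
    return "nucleotide" if nucleotide_like / len(seq) >= threshold else "protein"

example = ">a first record\nACGT\nACGT\n>b\nMKVLAAGIV\n"
for name, seq in parse_fasta(example).items():
    print(name, len(seq), guess_type(seq))
```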

Step 2. Select databases

After uploading your query sequences, you can select one or more databases to search.

Note: Only one type of database (Nucleotide or Protein) can be selected.

Step 3. Select parameters

The Advanced parameters input box allows you to run BLAST with custom parameters. Click on the “?” button to view the available parameters. If you leave the box blank, the default parameters are used.

Learn more about BLAST parameters here

Step 4. Perform BLAST

Once you have finished the previous steps, the proper BLAST sub-program (BLASTn, BLASTx, BLASTp etc.) is selected automatically according to the type of your query sequences and the databases, and the BLAST button changes accordingly. Click on the BLAST button to perform the analysis, and you will see a status page like this:

When the BLAST search finishes, you are taken to the result page automatically.
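The automatic choice of sub-program in Step 4 follows the usual BLAST convention of pairing the query type with the database type. The lookup below is a minimal sketch of that convention, not the interface's own code; `choose_blast_program` is a hypothetical name, and tblastx (translated nucleotide against translated nucleotide) is left out for simplicity.

```python
# (query type, database type) -> BLAST program
BLAST_PROGRAMS = {
    ("nucleotide", "nucleotide"): "blastn",
    ("nucleotide", "protein"): "blastx",   # query is translated
    ("protein", "protein"): "blastp",
    ("protein", "nucleotide"): "tblastn",  # database is translated
}

def choose_blast_program(query_type, database_type):
    """Return the BLAST program matching the query and database types."""
    try:
        return BLAST_PROGRAMS[(query_type, database_type)]
    except KeyError:
        raise ValueError(
            f"unsupported combination: {query_type!r} vs {database_type!r}"
        ) from None

print(choose_blast_program("nucleotide", "protein"))
```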

2-5-2 The BLAST result page

Below is an example of the BLAST result page. The result page can be divided into 8 parts:

Part 1: General information

This part contains general information about the job, including the version of the program, the submission time, the database information and the parameters used.

Part 2: Category of query sequences

This part lists the results for each of your query sequences. Clicking on a query sequence ID takes you to the details of that query sequence.

Part 3: Circos plot



This part is a Circos plot showing the mapping between the query sequences and the similar sequences in the database. Hovering over a ribbon shows the identity and E-value of that alignment.

Part 4: Download Category

This part allows you to download all the results to your local machine in different formats.
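Downloaded results can be filtered further with a short script. The sketch below assumes a tab-separated file in the standard 12-column BLAST tabular layout (the layout produced by `-outfmt 6` in command-line BLAST+); whether the download you chose has exactly these columns should be checked first. The two sample rows are invented.

```python
import csv
import io

# Column order of the standard 12-column BLAST tabular layout.
COLUMNS = ["qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
           "qstart", "qend", "sstart", "send", "evalue", "bitscore"]

def read_hits(handle, max_evalue=1e-5):
    """Read tabular BLAST hits and keep those with evalue <= max_evalue."""
    hits = []
    for row in csv.reader(handle, delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue  # skip blank and comment lines
        hit = dict(zip(COLUMNS, row))
        for key in ("pident", "evalue", "bitscore"):
            hit[key] = float(hit[key])
        if hit["evalue"] <= max_evalue:
            hits.append(hit)
    return hits

# Invented rows, for illustration only.
sample = (
    "query1\tgeneA\t98.5\t200\t3\t0\t1\t200\t501\t700\t1e-100\t360\n"
    "query1\tgeneB\t71.0\t60\t17\t1\t5\t64\t10\t69\t0.5\t30.1\n"
)
for hit in read_hits(io.StringIO(sample)):
    print(hit["qseqid"], hit["sseqid"], hit["evalue"])
```

With the default cutoff only the first row is kept; for a real download, replace `io.StringIO(sample)` with an opened file.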

Part 5: Graphical overview

This part shows the BLAST hits of each query sequence. Each bar indicates one hit in the database, and the color of the bar deepens as the hit gets stronger. Hovering over a bar shows the sequence ID and the E-value of that hit; clicking on a bar takes you to the detailed alignment information of that hit.

Part 6: Length distribution of hits



This is a histogram of the lengths of the similar sequences in the database. Hovering over the histogram shows the ID, E-value and length of the sequence.

Part 7: List view

This is a list view of the BLAST hits, including sequence name, query coverage, total score, E-value and identity. Clicking on a sequence name takes you to the detailed alignment of that hit.

Part 8: Alignment details

A: Tick the Select box to download only the results of the selected records from Part 2. Clicking on Sequence shows the detailed sequences. Clicking on FASTA or Alignment downloads the FASTA-format sequence or the alignment result, respectively.

B: The graphical overview and the alignment.

2-6 Synteny

The conserved syntenic blocks among the Zea genomes were analyzed using BLASTp and MCScanX (their parameters), and are visualized using the Tripal Synteny Viewer. The conserved syntenic blocks between a selected chromosome of one genome and another genome can be displayed interactively in both circular and tabular layouts, and the detailed gene information in each block is also displayed.

2-6-1 Select genomes

To get the synteny block information, either select a query chromosome and a target genome, or request a specific block ID if you already know one. Then click on the Search button to get the results.

2-6-2 Synteny block overview

The synteny viewer provides both circular and tabular layouts for the resulting blocks.

In the circular view, the blue bar indicates the query chromosome, the red bars indicate the target genome, and the ribbons indicate the synteny blocks. Hovering over a ribbon shows the block ID and the regions; clicking on a ribbon opens the detailed block information page. In the tabular layout, clicking on a block ID also opens the detailed block information.

2-6-3 Detailed block information

The detailed synteny block information page has 3 parts: the block information, the visualization and the tabular view. Hold the mouse over the visualization and scroll the mouse wheel to zoom in or out. Clicking on a gene ID in either the visualization or the tabular view opens the gene details page.

2-7 Gene expression pattern

You can search for the expression patterns of a set of gene IDs across different tissues in one sample (Reference expression), or across different samples in the same tissue (Population expression). Both the genes and the tissues/samples are clustered using the complete linkage method, and the result is shown as an interactive heatmap.
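For readers unfamiliar with complete linkage, the idea is that the distance between two clusters is the largest distance between any pair of their members, and the two closest clusters are merged at each step. The pure-Python sketch below shows this on three made-up expression profiles. Euclidean distance is an assumption here, since the text names only the linkage method, and the gene names are placeholders.

```python
import math
from itertools import combinations

def euclidean(a, b):
    """Euclidean distance between two equal-length expression profiles."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def complete_linkage(profiles):
    """Agglomerative clustering with complete linkage.

    profiles: dict mapping a name to a list of expression values.
    Returns the merges as (cluster_a, cluster_b, distance), in merge order.
    """
    clusters = [(name,) for name in profiles]
    merges = []
    while len(clusters) > 1:
        best = None
        for a, b in combinations(clusters, 2):
            # Complete linkage: cluster distance is the LARGEST pairwise distance.
            d = max(euclidean(profiles[x], profiles[y]) for x in a for y in b)
            if best is None or d < best[0]:
                best = (d, a, b)
        d, a, b = best
        # Replace the two closest clusters by their union.
        clusters = [c for c in clusters if c not in (a, b)] + [a + b]
        merges.append((a, b, d))
    return merges

# Placeholder gene names and made-up expression values.
profiles = {
    "gene1": [1.0, 2.0, 3.0],
    "gene2": [1.0, 2.0, 4.0],
    "gene3": [10.0, 0.0, 0.0],
}
for a, b, d in complete_linkage(profiles):
    print(a, b, round(d, 3))
```

gene1 and gene2 are merged first because their profiles are closest; gene3 joins last, at the distance to its farthest member of that pair.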

2-7-1 Select panel

The query gene IDs can be separated by commas “,” or by newlines. After entering the query genes, check the target tissues/samples and click on Search to get the result.
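A gene list copied from a spreadsheet or another tool often contains stray spaces, empty entries or duplicates. The helper below is a hypothetical clean-up step (not part of ZEAMAP) that produces a tidy list in the comma- or newline-separated form accepted by the input box; the IDs are placeholders.

```python
import re

def parse_gene_ids(text):
    """Split on commas or newlines, trim whitespace, drop blanks and duplicates."""
    ids = [token.strip() for token in re.split(r"[,\n]", text)]
    # dict.fromkeys keeps the first occurrence and preserves input order.
    return list(dict.fromkeys(i for i in ids if i))

raw = "geneA, geneB\ngeneC,,geneA\n"
print(",".join(parse_gene_ids(raw)))
# geneA,geneB,geneC
```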

2-7-2 Result panel

The resulting gene expression pattern is displayed as a heatmap. By default, both genes and tissues/samples are clustered in the heatmap. Using the control panels, you can download the image, sort tissues/samples alphabetically and re-cluster tissues/samples. Hovering over a cell of the heatmap shows the detailed expression level of the gene in that tissue. Clicking on a gene ID opens the gene's detailed information page.

2-8 CRISPR

ZEAMAP provides a tabular layout to search for single-guide RNAs (sgRNAs) designed for CRISPR genome editing experiments, including CRISPR/Cas9 (with NGG PAM) and CRISPR/Cpf1 (with TTV(A/G/C) and TTTV PAM). The sgRNAs were designed using CRISPR-Local, a local sgRNA design tool for non-reference plant genomes. The sgRNAs can be filtered by the target gene IDs or their genomic regions:

The resulting table looks like this:

TODO: [A screenshot for CRISPR table]

The meaning of each column is listed below:

Column 1: The name of the gene in which the sgRNA is located.

Column 2: The chromosome and the start coordinate of the sgRNA.

Column 3: The sequence of the sgRNA.

Column 4: The on-target score of the sgRNA. (No scoring method is available for Cpf1 sgRNAs; these are denoted by NA.)

Column 5: The number of off-target sites.

Column 6: The type of match between the sgRNA and off-target sites. (NM: no match found; U0: best match found was a unique exact match; U1: best match found was a unique 1-error match; U2: best match found was a unique 2-error match… R0: multiple exact matches found; R1: multiple 1-error matches found, no exact matches; R2: multiple 2-error matches found, no exact or 1-error matches.)

Column 7: The number of exact, 1-error, 2-error, 3-error and 4-error matches found.

Column 8: The gene and position in which an exact match was found. (If there is no exact match, this is denoted by NA.)

Column 9: The name of the exon(s) in which the sgRNA is located (separated by ;).

Column 10: The numbers separated by “:” are the TSS position, the exon start position, the length of the exon, the relative position of the sgRNA against the exon and the relative position of the sgRNA against the TSS, respectively.

Column 11: The highest off-target score between the sgRNA and all off-target sites. (No off-target scoring method is available for Cpf1 sgRNAs; these are denoted by NA.)
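The coded values in Column 6 and Column 10 are compact but not self-explanatory, so a small decoder can help when working with an exported table. The sketch below follows the column descriptions above; the function and field names are hypothetical, and the sample values are invented.

```python
import re

def describe_match_type(code):
    """Turn a Column 6 code (NM, U0, U1, R0, ...) into a readable phrase."""
    if code == "NM":
        return "no match found"
    m = re.fullmatch(r"([UR])(\d+)", code)
    if m is None:
        raise ValueError(f"unknown match type code: {code!r}")
    kind, errors = m.group(1), int(m.group(2))
    level = "exact" if errors == 0 else f"{errors}-error"
    if kind == "U":
        return f"best match was a unique {level} match"
    return f"multiple {level} matches found"

# Hypothetical field names for the five numbers in Column 10.
POSITION_FIELDS = ("tss_position", "exon_start", "exon_length",
                   "sgrna_offset_in_exon", "sgrna_offset_from_tss")

def parse_position_field(value):
    """Split a Column 10 value such as '1000:1200:300:50:250' into named numbers."""
    numbers = [int(x) for x in value.split(":")]
    if len(numbers) != len(POSITION_FIELDS):
        raise ValueError(f"expected {len(POSITION_FIELDS)} numbers, got {len(numbers)}")
    return dict(zip(POSITION_FIELDS, numbers))

print(describe_match_type("U1"))
print(describe_match_type("R0"))
print(parse_position_field("1000:1200:300:50:250"))
```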

The sgRNAs can also be browsed through JBrowse:

Double-clicking on an sgRNA opens its detailed information:

3 Variations

ZEAMAP collects

3-1 Haplotype

pass

3-2 SNP

pass

3-3 INDEL

pass

3-4 SV

pass

4 Genetics

pass

4-1 GWAS

pass

4-2 eQTL

pass

4-3 Linkage

pass

5 Populations

pass

6 Evolutions

pass

6-1 Domestication

pass

6-2 Adaptation

pass

6-3 Improvement

pass

7 PAN_genome