Intro & Databases - Data Deluge Decoded
- Bioinformatics in Microbiology: Computational analysis of microbial biological data (DNA, RNA, proteins). Key for pathogen identification, resistance detection, evolutionary studies, and drug discovery.
- NCBI (National Center for Biotechnology Information): Hosts GenBank. Mnemonic: your 'National Treasure' for bio-data!

| Database | Type | Focus | Key URL |
|---|---|---|---|
| GenBank | Primary | Nucleotide sequences (DNA/RNA) | ncbi.nlm.nih.gov |
| ENA (EMBL-EBI) | Primary | Nucleotide sequences | ebi.ac.uk/ena |
| DDBJ | Primary | Nucleotide sequences | ddbj.nig.ac.jp |
| UniProtKB | Secondary | Curated protein sequences & function | uniprot.org |
| PDB | Secondary | 3D structures (proteins & nucleic acids) | rcsb.org |
| InterPro | Secondary | Protein families, domains, functional sites | ebi.ac.uk/interpro |

Sequence Alignment - Sleuthing Sequences
- Sequence Alignment: Arranging DNA, RNA, or protein sequences to identify regions of similarity.
  - Pairwise Alignment: Compares two sequences.
  - Multiple Sequence Alignment (MSA): Compares three or more sequences (e.g., Clustal Omega, ClustalW).
- Key Terms:
  - Homology: Shared evolutionary ancestry between sequences.
  - Similarity: Percentage of aligned residues that are identical or chemically alike (conservative substitutions).
  - Identity: Percentage of aligned residues that are identical.
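
The difference between identity and similarity can be made concrete with a short Python sketch. It assumes the two sequences are already aligned (equal length, `-` for gaps) and that gapped columns are left out of the denominator; the `SIMILAR_GROUPS` table is a hypothetical simplification, since real tools score similarity with a substitution matrix such as BLOSUM62.

```python
# Hypothetical grouping of amino acids with similar chemistry.
# Real aligners use a substitution matrix (e.g., BLOSUM62) instead.
SIMILAR_GROUPS = [
    set("ILVM"),  # aliphatic / hydrophobic
    set("FYW"),   # aromatic
    set("KRH"),   # basic
    set("DE"),    # acidic
    set("STNQ"),  # polar, uncharged
]


def identity_and_similarity(aligned_a, aligned_b, gap="-"):
    """Return (percent identity, percent similarity) for two aligned sequences."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")

    compared = identical = similar = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == gap or b == gap:
            continue  # assumption: gapped columns are not counted
        compared += 1
        if a == b:
            identical += 1
            similar += 1  # identical residues also count as similar
        elif any(a in group and b in group for group in SIMILAR_GROUPS):
            similar += 1  # conservative substitution

    if compared == 0:
        return 0.0, 0.0
    return 100.0 * identical / compared, 100.0 * similar / compared


if __name__ == "__main__":
    pid, psim = identity_and_similarity("MKV-LDE", "MRVALEE")
    print(f"identity = {pid:.1f}%, similarity = {psim:.1f}%")
```

Similarity can never be lower than identity under this definition, because every identical pair is also counted as similar.
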
- BLAST (Basic Local Alignment Search Tool): Finds regions of local similarity. Mnemonic for BLAST types: 'Nucleotides Need Nucleotides (BLASTn), Proteins Prefer Proteins (BLASTp)'.

| Type | Query Sequence | Database Sequence | Use Case |
|---|---|---|---|
| BLASTn | Nucleotide | Nucleotide | DNA/RNA sequence similarity |
| BLASTp | Protein | Protein | Protein sequence similarity |
| BLASTx | Nucleotide (translated) | Protein | Finds potential proteins from DNA query |
| tBLASTn | Protein | Nucleotide (translated) | Protein query vs. translated nucleotide DB |
| tBLASTx | Nucleotide (translated) | Nucleotide (translated) | Translated nucleotide query vs. translated nucleotide DB |

- Significance Scores:
  - E-value (Expect value): Number of alignments with an equal or better score expected by chance in a database of that size. A lower E-value (e.g., < 1e-5) indicates a more statistically significant match.
  - Bit Score: Normalized alignment score that does not depend on database size. Higher bit score = better alignment (higher significance).

A lower E-value in BLAST results indicates a more statistically significant match, suggesting true homology rather than chance similarity.
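
The link between the two scores can be sketched with the standard relation E = m × n × 2^(−S'), where S' is the bit score, m the query length, and n the total database length. The function names below are illustrative, and the sketch ignores the effective-length corrections that BLAST applies, so it will only approximate reported values.

```python
import math


def expect_value(bit_score, query_len, db_len):
    """Approximate E-value from a bit score: E = m * n * 2**(-S')."""
    return query_len * db_len * 2.0 ** (-bit_score)


def bit_score_for_evalue(e_value, query_len, db_len):
    """Bit score needed to reach a target E-value in the same search space."""
    return math.log2(query_len * db_len / e_value)


if __name__ == "__main__":
    # Same bit score, bigger database: the E-value grows with the search space.
    for db_len in (1e6, 1e9):
        e = expect_value(50.0, query_len=300, db_len=db_len)
        print(f"db_len={db_len:.0e}  E={e:.2e}")
```

This is why the same alignment can look less significant against a larger database: the bit score is unchanged, but the E-value scales with the size of the search space.
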

Phylogenetic Analysis - Branching Out
Infers evolutionary relationships using molecular data.
- Markers: 16S rRNA (mnemonic: '16S for Species Sleuthing!'), ITS regions, housekeeping genes.
- Tree Parts: Root (common ancestor), Node (divergence point), Branch (lineage), Clade (group with common ancestor), OTU (Operational Taxonomic Unit/taxon).
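- Reliability: Bootstrap analysis resamples alignment columns and rebuilds the tree many times; support values > 70% are commonly taken to indicate a well-supported branch.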
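
The resampling step behind bootstrap support can be sketched as follows. This is a minimal illustration, not a full pipeline: the alignment is assumed to be a dict of equal-length strings, tree building is left out, and `bootstrap_alignment` and `support_percent` are hypothetical helper names.

```python
import random


def bootstrap_alignment(alignment, rng):
    """Build one bootstrap replicate by sampling columns with replacement."""
    lengths = {len(seq) for seq in alignment.values()}
    if len(lengths) != 1:
        raise ValueError("all aligned sequences must have the same length")
    n_cols = lengths.pop()
    # Draw n_cols column indices; some columns repeat, others drop out.
    cols = [rng.randrange(n_cols) for _ in range(n_cols)]
    return {name: "".join(seq[c] for c in cols) for name, seq in alignment.items()}


def support_percent(replicate_has_clade):
    """Percentage of replicate trees that contain the clade of interest."""
    flags = list(replicate_has_clade)
    return 100.0 * sum(flags) / len(flags)


if __name__ == "__main__":
    aln = {"taxonA": "ACGTAC", "taxonB": "ACGTTC", "taxonC": "TCGATC"}
    replicate = bootstrap_alignment(aln, random.Random(42))
    for name, seq in replicate.items():
        print(name, seq)
```

In practice each replicate is passed to the same tree-building method as the original alignment, and the support for a branch is the share of replicate trees that recover it.
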

The 16S rRNA gene is a cornerstone for bacterial and archaeal phylogenetic studies due to its conserved and variable regions.

| Method | Principle | Pros | Cons |
|---|---|---|---|
| Distance-Based | Uses overall genetic distance | Fast, simple | Loses some sequence info |
| - UPGMA | Assumes constant molecular clock | Simple | Often unrealistic clock |
| - Neighbor-Joining | Minimizes total branch length | Good for large sets | Can be inaccurate |
| Character-Based | Evaluates changes at each site | More info used | Computationally intensive |
| - Max Parsimony | Fewest evolutionary changes | Intuitive | Prone to long-branch attraction |
| - Max Likelihood | Highest probability given model | Statistically robust | Model-dependent, slow |
| - Bayesian | Posterior probability of trees | Incorporates prior info | Complex, can be slow |
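
As an illustration of the distance-based row, here is a minimal UPGMA sketch. It assumes pairwise distances are supplied as a dict keyed by `frozenset({a, b})`, returns the tree as nested `(left, right, height)` tuples, and reserves the hypothetical labels `_node0`, `_node1`, ... for internal nodes. It is meant to show the clustering logic, not to replace a phylogenetics package.

```python
from itertools import combinations


def upgma(names, dist):
    """Cluster taxa with UPGMA and return a nested (left, right, height) tuple.

    names: list of taxon labels.
    dist:  dict mapping frozenset({a, b}) to the distance between a and b.
    """
    # Active clusters: label -> (subtree, number of leaves in it).
    clusters = {name: (name, 1) for name in names}
    d = dict(dist)  # work on a copy so the caller's matrix is untouched
    next_id = 0

    while len(clusters) > 1:
        # Merge the closest pair of active clusters.
        a, b = min(combinations(clusters, 2), key=lambda pair: d[frozenset(pair)])
        # Constant molecular clock: the new node sits at half the distance.
        height = d[frozenset((a, b))] / 2.0
        tree_a, size_a = clusters.pop(a)
        tree_b, size_b = clusters.pop(b)

        new = f"_node{next_id}"
        next_id += 1
        # Distance to the merged cluster is the size-weighted average.
        for other in clusters:
            d[frozenset((new, other))] = (
                size_a * d[frozenset((a, other))]
                + size_b * d[frozenset((b, other))]
            ) / (size_a + size_b)
        clusters[new] = ((tree_a, tree_b, height), size_a + size_b)

    tree, _size = next(iter(clusters.values()))
    return tree


if __name__ == "__main__":
    pairs = {("A", "B"): 2, ("A", "C"): 6, ("B", "C"): 6,
             ("A", "D"): 10, ("B", "D"): 10, ("C", "D"): 10}
    matrix = {frozenset(k): v for k, v in pairs.items()}
    print(upgma(["A", "B", "C", "D"], matrix))
```

Because every merge places the new node at half the inter-cluster distance, all leaves end up equally far from the root. That is the constant-clock assumption listed as UPGMA's main weakness in the table.
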

Genomics & Applications - Bugs to Drugs
- Microbial Genomics: Genome annotation and comparative genomics.
- Metagenomics: Studying microbial communities directly from environmental samples (e.g., QIIME 2, MG-RAST).
  Metagenomics has revolutionized microbiology by enabling the study of previously unculturable microorganisms and their roles in complex ecosystems.
- Transcriptomics: Gene expression profiling with microarrays and RNA-Seq.
- Proteomics: Protein analysis by mass spectrometry, with search engines such as Mascot and SEQUEST.
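
One small piece of genome annotation, locating candidate genes, can be sketched as a naive open reading frame (ORF) scan. The sketch below makes several simplifying assumptions: forward strand only, `ATG` as the only start codon, and a hypothetical `min_codons` cutoff. Dedicated gene finders such as Prodigal also handle alternative start codons and the reverse strand.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(dna, min_codons=30):
    """Return (start, end, frame) for ATG...stop ORFs on the forward strand.

    start is 0-based, end is exclusive and includes the stop codon.
    """
    dna = dna.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(dna) - 2, 3):
            codon = dna[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # first start codon opens the ORF
            elif start is not None and codon in STOP_CODONS:
                # Count codons before the stop, including the ATG.
                if (i - start) // 3 >= min_codons:
                    orfs.append((start, i + 3, frame))
                start = None  # look for the next ORF in this frame
    return orfs


if __name__ == "__main__":
    toy = "CC" + "ATG" + "AAA" * 3 + "TAA" + "GG"
    print(find_orfs(toy, min_codons=3))
```

The length cutoff matters: short ORFs occur by chance in any sequence, so annotation pipelines combine a scan like this with other evidence such as homology searches.
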
Key Applications:
- Pathogen identification, outbreak tracing (epidemiology).
- Antimicrobial resistance (AMR) gene detection.
- Drug target discovery, vaccine development. Mnemonic: Bugs to Drugs.
'Omics' Technologies in Microbiology:
| Omics Type | Technology Examples | Key Application in Microbiology |
|---|---|---|
| Genomics | NGS, Sanger | Genome sequencing, annotation, comparison |
| Metagenomics | Shotgun seq, 16S rRNA | Microbial community study, unculturables |
| Transcriptomics | Microarrays, RNA-Seq | Gene expression profiling |
| Proteomics | Mass Spec (Mascot) | Protein ID, functional analysis |

High-Yield Points - Biggest Takeaways
- BLAST: Core for sequence similarity searches and identifying homologs.
- Genome Annotation: Defines gene locations and functions in microbial DNA.
- Phylogenetic Analysis (e.g., 16S rRNA): Traces microbial evolution and outbreaks.
- Metagenomics: Studies complex microbial communities directly from samples, bypassing culture.
- NGS Data Analysis: Essential for variant calling, RNA-Seq (transcriptomics), and epidemiology.
- Key Databases: GenBank (NCBI) for nucleotide sequences, PDB for protein structures.
- Drug Discovery: Bioinformatics identifies novel antimicrobial targets and resistance mechanisms.