![]() |
|
Tuberculosis 2007 687 pages Download PDF, 8.3 MB Home Preface 1. History 2. Molecular Evolution 3. Clinical Bacteriology 4. Genomics and Proteomics 5. Immunology/Pathogenesis 6. Host genetics 7. Epidemiology 8. Other M. tuberculosis 9. Molecular Epidemiology 10. New Vaccines 11. Biosafety/Hospital Control 12. Diagnostic Methods 13. Immunological Diagnosis 14. New Diagnostic Methods 15. Tuberculosis in Adults 16. Tuberculosis in Children 17. Tuberculosis and HIV/AIDS 18. Treatment and Drugs 19. Drug Resistance 20. New Perspectives Comments and Suggestions Copyright Removal Disclaimer About Editors Juan Carlos Palomino Sylvia Cardoso Leão Viviana Ritacco Contributing Authors
|
Chapter 4: Genomics and Proteomics by Patricia Del Portillo, Alejandro Reyes, Leiria Salazar, María del Carmen Menéndez and María Jesús García
4.1. Impact of new technologies on Mycobacterium tuberculosis genomics A new wave in the analysis of the physiological secrets of microorganisms started more than a decade ago with the reading of the first complete genome sequence, corresponding to the bacterium Haemophilus influenzae (Fleishman 1995). Nowadays, the accessibility to hundreds of bacterial genome sequences has changed our way of studying the bacterial world, including bacterial pathogens such as M. tuberculosis. The overwhelming information displayed by genome sequences started the era of "omics" technologies. These technologies are in accordance to the currently fast times. A quick search in PubMed, limiting results to the last 10 years, showed more than 27,000 papers devoted to "omics" issues: more than three thousand concerning bacteria, and almost three hundred concerning Mycobacterium tuberculosis. Up to five different "omics" methodologies have been described so far, all concerning the global study of the target organism, analyzing all its genes, transcriptional products, proteins, etc.
In the tuberculosis (TB) field, only papers concerning genomics, transcriptomics, and proteomics have been published. Integration of data derived from the several "omics" by bioinformatics will probably allow a rational insight into M. tuberculosis biology and its interactions with the host, leading to true control of the disease. Undoubtedly, the biggest step in our knowledge on TB during the last decade was the description of the complete genome sequence of the laboratory reference M. tuberculosis strain H37Rv (Cole 1998a). For example, the identification of genes involved in the bacterial cell wall biosynthesis, the routes for lipid metabolism, the location of insertion sequences and the variability in the PE_PPE genes allowed scientists to merge the fragments of knowledge derived from the pre-genomic era in a more comprehensive way. The sequence of the genome, and its comparison to sequences of other microorganisms reported in several databases, allowed the assignation of precise functions to 40 % of the predicted proteins and the identification of 44 % of orthologues (genes with very similar functions in a different species), leaving 16 % as unique unknown proteins. The elucidation of complete genome sequences and the development of microarray-based comparative genomics have been powerful tools in the progress of new areas by the application of robotics to basic molecular biology. Comparative genomics and genomic tools have also been used to identify factors associated with the pathogenicity of M. tuberculosis, such as virulence factors and genes involved in persistence of the pathogen in host cells. Moreover, these tools allowed a description of the evolutionary scenario of the genus (see Chapter 2). more... (PDF) or
Download of the entire textbook
Structural genomics was the starting point. As more accurate technologies became available, the
interest was focused into functional genomics. Thus, information on specific mRNA actively
synthesized by bacteria inside macrophages or during in vitro starvation, opened ways to the
analysis of gene expression. Microarray technology was applied to the detection of global gene
activity in M. tuberculosis under several environmental conditions. However, bacterial function
cannot be understood by looking at the mRNA level alone. A major barrier for genomic studies has
been the great number of genes with unknown function that have been identified. Up to 60 % of the
open reading frames (ORFs) had unknown functions after the initial annotation of genes
(identification of the protein unrevealed by the corresponding ORF's amino acid sequence) (Cole
1998a). The elucidation of protein function was possible with the global analysis of bacterial
proteins, giving insights into the functional role of several so far unknown proteins. Thanks to the
joint contributions of biochemical techniques and mass spectrometry, up to 1,044 non-redundant
proteins were reported in different cellular fractions (Mawuenyega 2005). The upcoming task will be
to assign them all a functional role. As more results are obtained from the proteomic analysis, it
is expected that the function of more ORFs will be unveiled with the aid of new data on
transcriptomics and proteomics.
Genomics and other molecular tools allowed studies on gene expression and regulation, which were
unthinkable years ago. M. tuberculosis is a restricted human pathogen; therefore it must have
developed mechanisms enabling its quick and efficient adaptation to a variety of "intra-human"
environments, which are, in fact, its natural habitat. Understanding how the bacillus regulates its
different genes according to environmental changes will probably lead to the comprehension of many
interesting aspects of M. tuberculosis, including latency and host-adaptation.
This chapter will address the general basics, as well as the state-of-the-art genomics,
transcriptomics and proteomics in relation to M. tuberculosis. Finally, a general overview will be
made on lipids, the most peculiar metabolites of this bacterium.
4.2. M. tuberculosis genome
4.2.1. Genomic organization and genes
TB research made huge progress with the availability of the genome sequence of the type strain M.
tuberculosis H37Rv (Cole 1998a). Expectations were generated on the elucidation of some unique
characteristics of the biology of the tubercle bacillus, such as its characteristic slow growth, the
nature of its complex cell wall, certain genes related to its virulence and persistence, and the
apparent stability of its genome. This first available genome sequence of a pathogenic M.
tuberculosis strain helped to answer some of these questions and, what is even more stimulating, to
open many more. We describe herein the main characteristics of the M. tuberculosis genome sequences
completed thus far and highlight some of the most interesting questions answered and opened with
this advance in TB research.
M. tuberculosis H37Rv (Cole 1998a) was revealed to possess a sequence of 4,411,529 bp, the second
largest microbial genome sequenced at that time. The characteristically high guanine plus cytosine
(G+C content; 65.5 %) was found to be uniform along most of the genome, confirming the hypothesis
that horizontal gene transfer events are virtually absent in modern M. tuberculosis (Sreevatsan
1997). Only a few regions showed a skew in this G+C content. A conspicuous group of genes with a
very high G+C content (> 80 %) appear to be unique in mycobacteria and belong to the family of PE or
PPE proteins. In turn, the few genes with particularly low (< 50 %) G+C content are those coding for
transmembrane proteins or polyketide synthases. This deviation to low G+C content is believed to be
a consequence of the required hydrophobic amino acids, essential in any transmembrane domain, that
are coded by low G+C content codons.
Fifty genes were found to code for functional RNAs. As previously described (Kempsell 1992), there
was only one ribosomal RNA operon (rrn). This operon was found to be located at 1.5 Mbp from the
origin of replication (oriC locus). Most eubacteria have more than one rrn operon located much
closer to the oriC locus to exploit the gene-dosage effect during replication (Cole 1994). The
possession of a single rrn operon in a position relatively distant from oriC has been postulated to
be a factor contributing to the slow growth phenotype of the tubercle bacillus (Brosch 2000a).
One of the most thoroughly studied characteristic of M. tuberculosis is the presence and
distribution of insertion sequences (IS). Of particular interest is IS6110, a sequence of the IS3
family that has been widely used for strain typing and molecular epidemiology due to its variation
in insertion site and copy number (van Embden 1993, see Chapter 9). Sixteen copies of IS6110 were
identified in the genome of M. tuberculosis H37Rv; some IS6110 insertion sites were clustered in
sites named insertional hot-spots. The same strain was found to harbor six copies of the more stable
IS1081, an insertion sequence that yields almost identical profiles in most strains when analyzed by
Restriction Fragment Length Polymorphism (RFLP) (Sola 2001, Kanduma 2003). Another 32 different
insertion sequences were found, of which seven belonged to the 13E12 family of repetitive sequences;
the other insertion sequences had not been described in other organisms (Cole 1998b). Virtually all
the ISs found in M. tuberculosis so far belong to previously described IS families (Chandler 2002).
The only exception is IS1556, which does not fit into any known IS family (Cole 1999).
Two prophages were detected in the genome sequence; both are similar in length and also similarly
organized. One is the prophage PhiRv1, which in the M. tuberculosis H37Rv genome interrupts a
repetitive sequence of the family 13E12. This prophage is deleted or rearranged in other M.
tuberculosis strains (Fleischmann 2002). The genome of M. tuberculosis possesses seven potential att
sites for PhiRv1 insertion, which explains the variability of its position between strains (Cole
1999). The second prophage, PhiRv2 has proven to be much more stable, with less variability among
strains (Cole 1999).
Regarding protein coding genes, it was determined that M. tuberculosis H37Rv codes for 3,924 ORFs
accounting for 91 % of the coding capacity of the genome (Cole 1998a). The alternative initiation
codon GTG is used in 35 % of cases compared to 14 % or 9 % in Bacillus subtilis or Escherichia coli
respectively. This contributes to the high G+C bias in the codon usage of mycobacteria.
A bias in the overall orientation of genes with respect to the direction of replication was also
found. On average, bacteria such as B. subtilis have 75 % of their genes in the same orientation as
that of the replication fork, while M. tuberculosis only has 59 %. This finding has led to the
hypothesis that such a bias could also be part of the slow growing phenotype of the tubercle
bacillus (Cole 1999). This conjecture, however, does not take into account the fact that E. coli, a
bacterium that grows much faster than M. tuberculosis, has only 55 % of its genes in the same
direction as the replication origin (Li 2005).
From the predicted ORFs, all proteins have been classified in 11 broad functional groups (Table
4-1), more precisely classified into COG functional categories
(http://www.ncbi.nlm.nih.gov/sutils/coxik.cgi?gi=135) according to the National Center for
Biotechnology Information (NCBI) of the United States (US). The analysis of the codon usage showed a
preference for G+C-rich codons. It was also found that the number of genes that arose by duplication
is similar to the number seen in E. coli or B. subtilis, but the degree of conservation of
duplicated genes is higher in M. tuberculosis. The lack of divergence of duplicated genes is
consistent with the hypothesis of a recent evolutionary descent or a recent bottleneck in
mycobacterial evolution (Brosch 2002, Sreevatsan 1997, see chapter 2).
From the genome sequence it is clear that M. tuberculosis has the potential to switch from one
metabolic route to another including aerobic (e.g. oxidative phosphorylation) and anaerobic
respiration (e.g. nitrate reduction). This flexibility is useful for survival in the changing
environments within the human host that range from high oxygen tension in the lung alveolus to
microaerophilic/anaerobic conditions within the tuberculous granuloma. Another characteristic of the
M. tuberculosis genome is the presence of genes for synthesis and degradation of almost all kinds of
lipids from simple fatty acids to complex molecules such as mycolic acids. In total, there are genes
encoding for 250 distinct enzymes involved in fatty acid metabolism, compared to only 50 in the
genome of E. coli (Cole 1999).
Concerning transcriptional regulation, M. tuberculosis codifies for 13 putative sigma factors and
more than 100 regulatory proteins (see section 4.3 of this chapter).
Among the most interesting protein gene families found in M. tuberculosis are the PE and PPE
multigene families, which account for almost 10 % of the genome capacity. The names PE and PPE
derive from the motifs of Pro-Glu (PE) and Pro-Pro-Glu (PPE) found near the protein N-terminus in
most cases. These proteins are believed to play an important role in survival and multiplication of
mycobacteria in different environments (Marri 2006). There are about 100 members of the PE family,
which is further divided into three sub-families, the most important of which is the polymorphic
GC-rich sequences (PGRS) class that contains 61 members. Proteins in this class contain multiple
tandem repetitions of the motif Gly-Gly-Ala, hence, their glycine concentration is superior to 50 %.
The PE_PGRS proteins have been found to be exclusive to the M. tuberculosis complex (Marri 2006)
and resemble the Epstein-Barr virus nuclear antigens (EBNA), which are known to inhibit antigen
presentation through the histocompatibility complex (MHC) class I (Cole 1999).
Interestingly, the analysis of the desoxyribonucleic acid (DNA) metabolic system of M. tuberculosis
indicates a very efficient DNA repair system, in other words, replication machinery of exceptionally
high fidelity. The genome of M. tuberculosis lacks the MutS-based mismatch repair system. However,
this absence is overcome by the presence of nearly 45 genes related to DNA repair mechanisms
(Mizrahi 1998), including three copies of the mutT gene. This gene encodes the enzyme in charge of
removing oxidized guanines whose incorporation during replication causes base-pair mismatching
(Mizrahi 1998, Cole 1999).
With the aim of making the information publicly available and the search and analysis of information
easier, the Pasteur Institute (http://www.pasteur.fr/recherche/unites/Lgmb/) has created a database
system incorporating not only all genes and annotation but other search tools such as Blast or
FastA, that allow the user to search for homologue sequences of a query sequence inside the M.
tuberculosis genome. This database is freely available for use on the Internet and is known as the
Tuberculist Web Server http://genolist.pasteur.fr/TubercuList/).
As more information was generated, databases grew bigger, more experimental information became
available, and better and more accurate algorithms for gene identification and prediction were
released. The initial genome annotation in M. tuberculosis H37Rv strain soon became out of date. For
this reason, a re-annotation of that genome sequence was published in 2002. This re-annotation
incorporated 82 additional genes. The gene nomenclature was not altered; the new genes have the name
of the preceding gene followed by A, B or D, for example, two new ORFs were described between Rv3724
and Rv3725, hence, they were named Rv3724A and Rv3724B. The letter C was not included since it
usually stands for "complementary", which means that the gene is located in the complementary
strand. As expected, the classes that exhibited the greatest numbers of changes were the unknown
category and the conserved hypothetical category (Table 4-1). The re-annotation of the genome
sequence allowed the identification of four sequencing errors making the current sequence size
change from 4,411,529 to 4,411,532 bp (Camus 2002).
As shown in Table 4-1, the information obtained from a single sequenced genome is enormous. The
advances made on the analysis of such information have accelerated TB research.
Table 4-1: Functional classification of M. tuberculosis H37Rv and re-annotation*
Class Function Number of genes (1998) Number of genes (2002)
0 Virulence, detoxification, adaptation 91 99
1 Lipid metabolism 225 233
2 Information pathways 207 229
3 Cell-wall and cell processes 516 708
4 Stable RNAs 50 50
5 Insertion sequences and phages 137 149
6 PE and PPE proteins 167 170
7 Intermediary metabolism and respiration 877 894
8 Proteins of unknown function 606 272
9 Regulatory proteins 188 189
10 Conserved hypothetical proteins 910 1,051
* Data taken from Fleischman 2002
4.2.2. Comparative genomics
In recent times, new technologies have been developed at an overwhelming pace, in particular those
related to sequencing and tools for genome sequence data management, storage and analysis. As of
April 2007, 484 microbial genomes have been finished and projects are underway aimed at the
sequencing of other 1,155 microorganisms (http://www.genomesonline.org/gold.cgi). Mycobacteria are
not an exception in this titanic genome-sequencing race; since 1998, when the first mycobacterial
genome sequence was published (Cole 1998a); many genome projects have been initiated. Until April
2007, 34 projects on the genome sequencing of different mycobacterial species are finished or
in-process. Of these, 15 are directed towards M. tuberculosis strains, and 5 towards other members
of the M. tuberculosis complex. This information will be invaluable to improve the knowledge about
M. tuberculosis in the next few years. Currently, there are only two M. tuberculosis (H37Rv and
CDC1551) and two M. bovis (AF2122/97 and BCG Pasteur) genome sequences annotated and published. For
this reason, these are the strains that have been used as reference strains for comparative genomics
both in vitro and in silico.
The pioneer of in vitro assays of comparative mycobacterial genomics involved comparison of
restriction profiles using low frequency restriction enzymes and pulsed-field gel electrophoresis
(PFGE). These studies allowed a rough analysis of differences among M. bovis bacille Calmette-Guérin
(BCG) isolates (Zhang 1995) and most importantly, contributed to the construction of the first
physical maps, which were essential for the generation of the first genome sequence (Philipp 1996).
The next step in comparative genomics was the use of genomic subtractive hybridization or bacteria
artificial chromosome hybridization for the identification of regions of difference among the
strains under analysis (Mahairas 1996, Gordon 1999). Mahairas et al. (Mahairas 1996) used
subtractive hybridization to identify regions of difference that account for the avirulent phenotype
of the vaccine strain M. bovis BCG. As a result of their studies, they identified three regions of
difference (RD1-RD3) in the genome of M. tuberculosis H37Rv that appeared to be absent from M. bovis
BCG. Further studies of these regions showed that RD3 corresponded to the prophage PhiRv1, a
sequence that has been shown to vary among M. tuberculosis clinical isolates and laboratory strains
(see section 4.2.1). RD2 was only deleted in isolates of M. bovis BCG that were re-cultured after
1925. Finally, RD1 turned out to be the only sequence deleted from all M. bovis BCG strains and
present in pathogenic strains. However, complementation assays did not reconstitute the full
virulent phenotype in M. bovis BCG (Mahairas 1996). The RD1 region contains eight ORFs, including
members of the Early Secretory Antigenic Target 6 (ESAT-6) gene cluster (Brosch 2000a). The ESAT-6
proteins have been shown to act as potent stimulators of the immune system (Brodin 2002).The genome
of H37Rv contains 23 copies of ESAT-6 family proteins distributed in 11 different regions. Except
for esxQ, all are clustered in pairs belonging to the ESAT-6 and CFP-10 protein families (Stanley
2003, Gey Van Pittius 2001).
Gordon et al. (Gordon 1999) used ordered bacteria artificial chromosome arrays to determine genomic
differences between M. tuberculosis H37Rv and M. bovis BCG. As a result, they identified 10 regions
of difference, including the three previously described (Mahairas 1996). Interestingly, two of the
newly described regions (RD5 and RD8) also contained members of the ESAT-6 family of proteins. In
addition, RD5 contained three genes coding for phospholipase C, a gene with a putative role in
mycobacterial pathogenesis (Johansen 1996). Several members of the PE and PPE family proteins were
also found in the regions of difference. One copy of IS1532 was identified in RD6 and one copy of
IS6110 in RD5. Furthermore, the study searched for regions present in M. bovis BCG but absent from
M. tuberculosis H37Rv. Two regions with this characteristic were found and were named RvD1 and RvD2
standing for H37Rv Deleted. Almost all ORFs from these regions code for unknown proteins, so the
role of these deletions has not been elucidated.
Until 2002, most studies concerning comparative genomics were based on differences among the strain
type M. tuberculosis H37Rv and other tuberculous bacilli (Behr 1999, Brosch 1999, Brosch 2002).
Different approaches using DNA hybridization techniques, such as microarrays, allowed identification
of regions of difference with more accuracy and sensitivity than previous methodologies. In total,
16 regions of difference have been found in M. tuberculosis H37Rv that were deleted from M. bovis
BCG. The basic idea behind the identification of regions of difference between the avirulent strain
M. bovis BCG and the virulent laboratory strain M. tuberculosis H37Rv was the identification of
specific deletions in all BCG strains that could be responsible for their lack of virulence.
However, nine of the regions of difference were also absent in pathogenic isolates of M. bovis.
Other studies have been done comparing M. tuberculosis H37Rv to its avirulent counterpart M.
tuberculosis H37Ra (Brosch 1999), in which other Rv-deleted regions were identified. These regions,
named RvD3 to RvD5, were found to be products of homologous recombination of adjacent IS6110, as
with RvD2. Finally, only RD1 was found to be absent in all M. bovis BCG strains and present in other
members of the complex.
The regions of difference were used as markers of the molecular evolution of M. tuberculosis (Brosch
2002) and are represented in Figure 4-1. The use of deletions as molecular markers has been
described in Chapter 2.
Besides the above mentioned deletions, two duplications were identified in the M. bovis BCG genome
(Brosch 2000b). These duplications, named DU1 and DU2, apparently arose from independent events. DU1
seems to be restricted to the BCG Pasteur strain and comprises the OriC locus, indicating that BCG
Pasteur is diploid for OriC and some other neighboring genes. The DU2 region has been found in all
BCG substrains tested and includes the sigma factor sigH, which has been related to the heat-shock
response (Brosch 2001). Some excellent reviews are available on comparative genomics, made before
the publication of the second M. tuberculosis genome (Cole 1998a, Brosch 2000a, Brosch 2000c, Brosch
2001, Domenech 2001, Cole 2002a, Cole 2002b).
In 2002, the second M. tuberculosis genome sequence was completed, namely the clinical strain
CDC1551, which had been previously involved in a TB outbreak. This strain was considered to be
highly transmissible and virulent for human beings (Fleischmann 2002). With the sequence of this
second strain, a first approach to the bioinformatic analysis of intraspecies variability became
possible. In the initial comparison by sequence alignment, H37Rv presented a total of 37 insertions
(greater than 10bp) relative to strain CDC1551; from these, 26 affected ORFs while the remaining 11
were intergenic. On the other hand, CDC1551 presented 49 insertions relative to M. tuberculosis
H37Rv; 35 affecting ORFs and 14 intergenic. A total of 80 ORFs were inserted in either genome, 25
(31.2 %) of them were hypothetical or conserved hypothetical ORFs, while 36 (45 %) corresponded to
the family of PE/PPE proteins, showing the potential role of this family of proteins in antigenic
variability and thus in pathogenicity.
Deletion M. tuberculosis H37Rv M. africanum M. microti M. bovis M. bovis BCG
RD2
RD14
RD1
RD4
RD12
RD13
RD7
RD8
RD10
RD9
RvD1
TbD1
Figure 4-1: Distribution of deleted regions in M. tuberculosis complex members. Dark gray filled
cells indicate the presence in all strains tested, light gray indicate the presence in some strains,
white is absence from all strains tested. Data taken from (Gordon 1999, Brosch 2002, Brosch 2000b,
Marmiesse 2004)
Only one major rearrangement was found, consisting of the PhiRv1 (RD3),which was found in the genome
of M. tuberculosis H37Rv on coordinates 1,779,312 associated with a protein of the REP13E12 family.
On the genome of CDC1551, it was found to be located on the complementary strand at coordinates
3,870,803, also associated with a REP13E12 protein. M. tuberculosis CDC1551 was found to have four
copies of IS6110 while M. tuberculosis H37Rv had 16. Interestingly, four of the 16 IS6110 copies
found in M. tuberculosis H37Rv lacked the characteristic 3 to 4 base pair direct repeat and were
adjacent to regions deleted in M. tuberculosis H37Rv relative to M. tuberculosis CDC1551, which
suggests homologous recombination.
Since 2002, a large number of studies has been based on Large Sequence Polymorphisms (LSPs) and
Single Nucleotide Polymorphisms (SNPs), identified by the comparison of the first two M.
tuberculosis genome sequences (Hughes 2002, Gutacker 2002). These studies have been complemented
with data obtained from the genome sequence of a third organism of the M. tuberculosis complex. The
complete genome of Mycobacterium bovis AF2122/97, a fully virulent strain isolated from a diseased
cow in 1997 in Great Britain, was published in 2003 (Garnier 2003). This genome was composed of
4,345,492 bp with a G+C content of 65.63 %, 3,952 putative coding genes, one prophage (PhiRv2), and
four IS elements. As expected, similarity of more than 99.95 % was found with a complete
colinearity, without evidence of extensive rearrangements. With regard to LSP, most of them have
been described above as regions of difference. Sequencing confirmed the absence of 11 regions of
difference, and the presence of only one insertion in comparison to the sequenced M. tuberculosis
genomes: the region named M. tuberculosis specific deletion 1 (TbD1), is a reflection that deletion
events relative to M. tuberculosis have shaped the M. bovis genome. The comparison of the three
genomes reflects the high degree of conservation among the members of the M. tuberculosis complex,
as well as the divergence of M. bovis related to M. tuberculosis strains.
For specific proteins or genes that vary between M. bovis and M. tuberculosis, a detailed list can
be found in Garnier et al. (Garnier 2003). However, it is important to mention that the greatest
degree of variation among these bacilli is found in genes encoding cell wall components and secreted
proteins. Extensive variations have been found in genes of the PE/PPE family of proteins as well as
in genes from the ESAT-6 family, where six of the more than 20 members are absent or altered in M.
bovis. Some other changes are registered in genes coding for lipid synthesis and secretion as the
mmpL and mmpS family of genes. Deletions responsible for the M. bovis requirement of pyruvate as a
carbon source were also identified (Garnier 2003).
The analysis of the genome sequence of members of the M. tuberculosis complex has led to great
advances in the knowledge of the biology and pathogenesis of these bacteria. The sequencing of whole
genomes of Mycobacterium leprae (Cole 2001), Mycobacterium avium subspecies paratuberculosis (Li
2005) and of other members of the genus, such as Mycobacterium smegmatis and M. bovis, has also made
huge contributions to the understanding of the lifestyle of mycobacteria. Recently, a report
compared the metabolic pathways shared among five of the mycobacterial genomes that have been
sequenced (the genome sequence of M. smegmatis was not included on this report) (Marri 2006). The
characteristics of the sequenced genomes of organisms in the genus Mycobacterium are presented in
Table 4-2. The main differences were found in ISs, the PE/PPE gene family, genes involved in lipid
metabolism and those encoding hypothetical proteins. The members of the M. tuberculosis complex had
the highest number of IS elements, which might suggest higher intra-species variability in M.
tuberculosis compared to other species of mycobacteria.
Table 4-2: Features of sequenced genomes of species belonging to the Mycobacterium genus*
Feature M. tuberculosis H37Rv M. tuberculosis CDC1551 M. bovis AF2122/97C M. leprae M. avium
subsp. paratuberculosis M. smegmatis
Genome size (bp) 4,411,529 4,403,836 4,345,492 3,268,203 4,829,781 6,988,209
Protein coding genes 3,927 4,186 3,920 1,604 4,350 6,897
G+C (%) 65.6 65.6 65.6 57.79 69.3 67.40
Protein coding (%) 91.3 ~ 91 90.8 49.5 91.5 92.42
Gene density (bp/gene) 1,114 1,052 1,099 2,037 1,112 1,013
Average gene length 1,012 952 995 1,011 1,015 936
tRNAs 45 45 45 45 45 47
rRNA operon 1 1 1 1 1 2
*Data taken from Li 2005, Marri 2006
The comparison of the proteins encoded within the five sequenced genomes revealed a core, or a
number of shared proteins, of 1,326 proteins, compared to the 219 core genes described by macroarray
and bioinformatic analyses (Marmiesse 2004). Unique genes ranged between 966 (M. avium subsp
paratuberculosis) and 26 (M. tuberculosis H37Rv) depending on the genome, and most of these proteins
are hypothetical. Regarding the PE/PPE family proteins, it is worth mentioning that M. tuberculosis
and M. bovis contained the highest number of these proteins, while neither M. leprae nor M. avium
subsp paratuberculosis have PE_PGRS proteins. Also, a wide variation has been noted in the mmpL gene
family, known to participate in lipid transport and secretion. It has been proposed that these
variations could be involved in host specificity (Marsh 2005).
4.2.3. Comparing genomes of clinical strains of M. tuberculosis
Genome comparison has shown that gene content can vary between strains of M. tuberculosis. The
analysis of complete genome sequences identified SNPs, LSPs, and regions of difference (RDs) when
clinical isolates of M. tuberculosis were compared (Fleischmann 2002, Gutacker 2002, Tsolaki 2004,
Filliol 2006).
The microarray approach allows the comparison of a large number of genomes, providing information on
the diversity, frequency, and phenotypic effects of polymorphisms in the population (Tsolaki 2004).
This kind of genomic analysis is also useful for the investigation of outbreaks. Particularly when
applied to genomics, DNA microarrays allow the identification of sequences present in the M.
tuberculosis reference strain, but absent from different clinical isolates. Unfortunately, the
microarray technique cannot detect genes present in a clinical isolate that are absent in the
reference strain. These changes can originate from small deletions, deletions in homologous
repetitive elements, point mutations, genome rearrangements, frame-shift mutations, and multi-copy
genes (Ochman 2001, Schoolnik 2002). Fleischeman et al. suggested that genetic variation among M.
tuberculosis strains might denote selective pressure, and therefore might play an important role in
bacterial pathogenesis and immunity (Fleischmann 2002). Although associations between host and
pathogen populations seems to be highly stable, the evolutionary, epidemiological, and clinical
relevance of genomic deletions and genetic variation regions remain ill-defined, as do the molecular
bases of virulence and transmissibility (Hirsh 2004).
Up to six M. tuberculosis lineages adapted to specific human populations have been described by
Gagneux et al. using comparative genomics and molecular genotyping tools: the Indo-Oceanic lineage,
East-Asian lineage, East-African-Indian lineage, Euro-American lineage, and two West-African
lineages (Gagneux 2006, see chapter 2). Specific deletions associated with the hypervirulent
Beijing/W strains of M. tuberculosis were identified (Tsolaki 2005). Evidently, these differences
cannot include sequences present in clinical isolates that are absent from M. tuberculosis H37Rv, so
they necessarily represent a small part of the total potential genetic variability. Up to 13
complete genome sequences of representative M. tuberculosis clinical isolates are currently under
progress (http://www.genomesonline.org/gold.cgi). That number accounts for near half of all the
mycobacterial strains that are currently undergoing complete genome sequencing.
All major functional categories are represented among deleted genes in clinical isolates of M.
tuberculosis. Mobile genetic elements (insertion sequences or prophages) are frequently deleted. DNA
loss frequently results from the activity of the insertion sequence IS6110 (Brosch 1999, Gordon
1999). The rate of deletion in genes involved in intermediary metabolism and respiration, and in
cell wall synthesis is surprisingly high. Some of the genes encoding for potential antigens (plcA,
plcD, lpqH, lppA, esx, or PE/PPE genes) might be deleted under the influence of host selective
pressures, which would confer an adaptational advantage during infection or help transmission
(Tsolaki 2005). Some of these missing genes (e.g. esx genes) encode proteins from the ESAT-6 family
(Marmiesse 2004).
The use of microarray-based comparative genomics for the study of the genetic variability of
pathogens provides interesting information. Not only the identification of the deleted or absent
genes is important, but also the differential hybridization signal between samples is of interest.
These differential signals can indicate sequence divergence or a difference in the copy number,
which may provide an insight into strain evolution and pathogenesis (Taboada 2005).
4.2.4. Functional genomics
Functional genomics is the analysis of the biological function of the genes and their products
within a cell or organism. Unlike genomics and proteomics, functional genomics focus on gene
transcription, translation, and protein-protein interactions. Genes operate as long as they are
expressed and their expression is regulated at the transcriptional or post-transcriptional level.
Functional genomics uses mRNA expression profiling to provide a picture of the transcriptome in a
specific condition or time, in order to identify co-regulated genes that perform common metabolic
and biosynthetic functions. A set of co-regulated genes is known as a regulon.
Microarrays have additional applications in functional genomics apart from gene expression studies,
and other uses have also been reported. Using this new powerful technique, Sassetti et al. developed
a method to map transposon insertion sites in order to identify essential genes in mycobacteria. The
probes were synthesized from a transposon library and then used for hybridization in the array. A
number of mutants carrying an insert in each gene were obtained, which were later isolated and
identified (Sassetti 2001).
The results of studies on comparative mycobacterial genomics have been validated by functional
analysis, involving transcriptomics and proteomics. In fact, gene knock-out followed by transcript
analysis and proteome definition seems to be the way to identify essential genes. For example, M.
tuberculosis genes that encode functions essential for growth are prime choices for further
investigation as targets for the development of new drugs or diagnostic methods (Cole 2002b).
Subsequently, research derived from comparative genomic studies was directed towards the study of
particular genes. That is the case of the deletion designated as RD750 (corresponding to the Rv1519
gene) in the genome of the M. tuberculosis strain named CH, from a large outbreak that occurred in a
community of Indian immigrants in the United Kingdom (Rajakumar 2004) and belonging to the East
African-Indian lineage. Complementation and combination of in vitro and in vivo assay systems
indicated the participation of the gene Rv1519 in the persistence and outbreak potential of this M.
tuberculosis lineage in human populations (Gagneux 2006, Newton 2006).
Construction and transcription analysis of the appropriate mutant have revealed the functional role
of the Rv3676 gene, a member of the cyclic adenosine monophosphate (cAMP) receptor protein family of
transcription factors. This factor is required for virulence of M. tuberculosis in the mouse model.
The functional map obtained from the transcriptome revealed information about regulatory pathways.
The global transcription profiling experiments, comparing the wild type M. tuberculosis H37Rv strain
and Rv3676 mutant grown in vitro, identified some of the genes that are co-regulated, directly or
indirectly, by Rv3676 in M. tuberculosis (Rickman 2005).
4.3. Gene expression in M. tuberculosis
4.3.1. Control of gene expression
The ability of M. tuberculosis to survive within host cells requires a complex and tightly
controlled gene regulation. The genes that are used under different conditions could be readily
inferred from the corresponding mRNAs. Thanks to the development of highly specific and sensitive
technologies, such as microarrays and quantitative real-time Polymerase Chain Reaction (qRT-PCR), it
is now possible to analyze the global expression from both the bacillus and the infected host. Taken
together, all this could help us to understand the adaptative machinery of M. tuberculosis.
The deciphering of the complete M. tuberculosis genome sequence has unveiled its well-equipped
machinery, which accounts for its high degree of adaptability. Thirteen putative sigma (s) factors
and 192 regulatory proteins seem to be involved in the control of M. tuberculosis gene expression
(Cole 1998a). Interchangeable s factors regulate the function of RNA polymerase, initiating
transcription and conferring promoter specificity to the holoenzyme (Kazmierczak 2005, Mooney 2005).
To date, consensus promoter sequences have been proposed for six s factors, besides the housekeeping
s factor, sA (for a review, see Rodriguez 2006). Gene expression levels could be further modified by
the action of transcriptional activators and repressors: regulatory proteins (Barnard 2004). These
regulatory proteins include 11 two-component systems, five unpaired response regulators, seven wbl
genes, and more than 130 other putative transcriptional regulators (Cole 1998a).
The differential expression of these regulatory gene products throughout different stages of the
lifespan of M. tuberculosis must be determinant for the pathogen's successful infection and/or
persistence within the human host. In recent years, a number of reports have correlated the response
of several of these transcriptional regulators to a variety of environmental stresses (for a
summary, see Table 4-3 at http://www.tuberculosistextbook.com/pdf/Table 4-3.pdf), such as cold
shock, heat shock, hypoxia, iron or zinc starvation, nitric oxide, surface stress and oxidative
stress (Manganelli 1999, Raman 2001, Sherman 2001, Shires 2001, Manganelli 2002, Stewart 2002, Park
2003, Rodriguez 2003, Voskuil 2003, Canneva 2005, Geiman 2006). However, the biological signals that
stimulate the expression of the majority of them are still poorly recognized. Likewise, the
connections between the different regulatory circuits of the complex network that controls gene
expression in M. tuberculosis are incompletely established. An example of the intricacy of this
network is the genetic regulation of sigB, which is induced by sE in response to surface stress
(Manganelli 2001) or by sH under heat shock and oxidative stress (Manganelli 2002). The regulation
of sigB expression seems to be more complex than the above cited, given that sF- and sL-dependent
promoters were identified in the regulatory promoter region of sigB; and sL-dependent transcription
was originated upstream to sigB (Dainese 2006). It has been shown that sH is also responsible of the
induction of sigE after heat shock and exposure to diamine (Raman 2001). DNA microarray experiments
with M. tuberculosis mutants revealed that some s factors control the expression of their own
structural gene (Manganelli 2002, Geiman 2004, Sun 2004, Raman 2004, Dainese 2006). Autoregulation
has also been demonstrated for the six two-component systems studied so far in M. tuberculosis,
which are senX3-regX3 (Himpens 2000), trcRS (Haydel 2002), prrAB (Ewann 2004), dosRS (Bagchi 2005),
mprAB (He 2005), and phoPR (Gupta 2006). Two-component signal transduction systems are composed of a
histidine kinase sensor and a cytoplasmic response regulator that is activated by the cognate
histidine kinase (West 2001). One of these systems, dosRS, is induced by hypoxia, exposure to
ethanol or the nitric oxide donor S-nitrosoglutathione (Sherman 2001, Voskuil 2003, Kendall 2004).
This regulon is responsible for the transcriptional changes during oxygen limitation, which is
considered an important stimulus for the entry of M. tuberculosis into a dormant state (Wayne 2001).
For this reason, the genes included under control of dosRS are considered members of the dormancy
regulon. Recently, the induction of sigB and sigE has been shown to depend on the two-component
system MprA/MprB when the bacilli are subjected to surface stress (He 2006).
The transcriptional regulator WhiB3 seems to positively regulate the expression of the housekeeping
s factor named sigA, by interacting with the subregion 4.2 of s factor (Steyn 2002). WhiB3 is
encoded by one of the seven whiB-like genes described in the M. tuberculosis genome (Cole 1998a) and
belongs to the wbl family of genes, which encodes putative transcription factors, which are unique
to actinomycetes (Molle 2000, Soliveri 2000). A recent analysis has demonstrated that the expression
of M. tuberculosis whiB-like genes is modified in response to anti-mycobacterial agents and
environmental stress conditions (Geiman 2006). Additionally, whiB1 transcription is regulated by
cAMP levels via direct binding of the activated form of the product of Rv3676 (CRP protein-cAMP) to
a consensus site adjacent to the whiB1 promoter (Agarwal 2006).
A post-translational regulation has also been reported for several s factors. Antagonist proteins,
known as anti-s factors, can negatively regulate some s factors by sequestering them and preventing
their association with RNA polymerase. Many of these anti-s factors are located downstream of their
cognate s factor-encoding gene and both genes are usually co-transcribed (Bashyam 2004). The
functions of five specific anti-s factors of M. tuberculosis have so far been examined: RseA
(Rodrigue 2006); RshA (Song 2003); RslA (Hahn 2005, Dainese 2006); RsbW or UsfX (Beaucher 2002); and
RskA (Saïd-Salim 2006). Interestingly, RsbW, the sF-specific antagonist, is post-translationally
regulated by two anti-anti-s factors: RsfA and RskB (Beaucher 2002, Parida 2005).
Although the function of many of these mycobacterial transcriptional regulators and signal
transduction systems remains poorly defined, recent studies have begun to provide evidence of the
biological role of these regulatory circuits throughout each stage of the lifecycle of M.
tuberculosis inside the human host. The expression of sigA, sigE and sigG (Manganelli 2001, Capelli
2006, Volpe 2006), that of some two-component systems (Ewann 2002, Haydel 2004, Walters 2006), as
well as that of the transcriptional regulator whiB3 are induced during macrophage infection. The
role of these transcriptional regulators in pathogenesis and virulence became even more evident in
animal model experiments, where disruption or deletion of these genes was shown to affect M.
tuberculosis virulence in mice (Parish 2003a, Parish 2003b, Sun 2004, Manganelli 2004, Raman 2004,
Hahn 2005, Walters 2006).
Studies on mutagenesis and the expression profile of several regulators during the growth of M.
tuberculosis in the macrophages and in organs of experimental animal models are currently underway.
These regulators can modify bacterial physiology and are able to modulate host-pathogen interactions
in response to environmental signals.
4.3.2. In vitro gene expression
M. tuberculosis is an obligate mammalian pathogen that is able to infect many different cells,
including macrophages, dendritic cells, alveolar-epithelial cells, and neutrophils. It is also able
to reside extracellularly in the lung, inside granulomas. As mentioned previously, the tubercle
bacillus adapts its transcriptome to the environment in which it replicates. The adaptation of a
bacterium to harsh environments involves the transcriptional activation of genes whose final
products help the bacterium to reprogram its physiology, thus ensuring survival. Among the genetic
determinants that the bacterium must modulate are those involved in intermediary and secondary
metabolism, cell wall processes, stress responses and signal transduction pathways.
By utilizing the microarray technology, quantitative RT-PCR and laboratory generated mutants,
studies on M. tuberculosis global gene expression have been undertaken using broth cultures, cell
cultures, and animal models. None of these models reproduce several key features of TB in the human
infection; and, unfortunately, no data is available from human tissues.
Table 4-4 summarizes the most important genes whose expression is modulated by the transcriptional
regulators mentioned previously (see section 4.3.1).
Table 4-4: Regulation of cell process genes
Transcriptional regulation Condition Cell process genes
Up
regulated Down
regulated
sC Growth curve Two-component systems: senX3, mtrA, hspX (a-crystallin), fbpC (antigen 85C)
sD Exponential growth rpfC (resuscitation factor), Rv1815, Rv3413c PE_PGRS family genes
sH Diamine exposure, heat stressb Heat shock proteins: hsp, clp, trxB2C operon, transcriptional
regulators: sigE, sigB
sM Log-phase growth Esx family genes, PPE1, PPE19 PPE60, kasA-kasB, fas, pks2, pks3
sE Exponential growth, SDS exposure Icl1 (isocitrate lyase), heat-shock proteins,
transcriptional regulators: sigB, mprAB
sL sigL-rslA, pks10-pks7, mpt53-Rv2877c, Rv1139c-Rv1138c
sF Stationary-phase growth HP and CHP family of proteins, transcriptional regulators: sigC, sigF,
and MarA, GntR TetR family, cell envelope: murB
DosS-DosR Standing cultures hspX (a-crystallin), Rv3130c (CHP), Rv1738 (CHP), Rv0572c (CHP)
PhoPR Exponential growth Cell envelope components
MprA Mid-exponential phase ~95 genes of M. tuberculosis. PE/PPE gene family, HP, CHP
CHP = conserved hypothetical protein
For example, analysis of a mutant of M. tuberculosis sigC showed that this s factor induces the
expression of some virulence-associated genes. On the contrary, the genes hspX (encoding the
a-crystalline homologue), senX3 (sensor kinase), mtrA (response regulator), and fbpC (mycolyl
transferase and fibronectine binding protein or antigen 85C) were down-regulated in that mutant
strain at different times of the growth curve (Sun 2004). Genes induced by sD include the
resuscitation promoting factor rfpC, several chaperone genes and genes involved in lipid metabolism
and cell wall processes (Raman 2004). Thirty-nine genes were shown to be under the control of sH.
These include genes coding for some heat shock proteins (hsp and clp), the trxB2C operon and some
transcriptional regulators (Manganelli 2002). Quantification of mRNA by primer extension under
different stresses demonstrated that the transcription of trxB2, dnaK, clp and sigE could be induced
from sH-dependent promoters located upstream of these genes (Raman 2001). sE seems to regulate the
expression of proteins involved in fatty acid degradation, such as the isocitrate lyase (coded by
icl1), two proteins related to fatty acid degradation (fadE23 and fadE24), heat shock proteins, and
the transcriptional regulators sigB and mprAB (Manganelli 2001). Recently, it was reported that sM
induces the expression of two pairs of secreted proteins of the Esx family and two PPE genes, while
it negatively regulates PPE60 expression as well as the expression of several genes involved in
surface lipid biosynthesis and transport (Raman 2006).
At least four small operons appear to be directly regulated by sL: sigL-rslA, pks10-pks7,
mpt53-Rv2877c, and Rv1139c-Rv1138c, which clearly have a sL-consensus promoter sequence in their
regulatory region (Hahn 2005, Dainese 2006). The pks genes are involved in the biosynthesis of
phthiocerol dimycocerosate, a component of the cell envelope associated with virulence (Sirakova
2003); and the mpt operon contains genes involved in fatty acid transport (Sonden 2005). DNA
microarrays of a mutant of M. tuberculosis lacking a functional sigF, have revealed that this s
factor is able to induce gene expression almost exclusively during the stationary phase of growth,
supporting the hypothesis of a major role of sF in the adaptation to the stationary phase. Among the
sF-targeted genes, 50 % coded for hypothetical proteins or proteins of unknown function, some were
transcriptional repressor/activators (MarR, GntR and TetR family of DNA binding regulators) and
others were found to be involved in the biosynthesis and structure of the cell envelope (Geiman
2004).
A complete genomic microarray analysis has also been performed on M. tuberculosis strains mutated in
two-component regulatory systems. The dormancy-related two-component system dosRS was inactivated in
M. tuberculosis using different methodologies (Parish 2003b, Park 2003). It was shown that the
expression of this two-component system is highly induced under hypoxia (Sherman 2001b, Park 2003).
A consensus dosR-specific binding motif was reported to be located upstream of hypoxic response
genes (Park 2003, Kendall 2004). The microarray expression profiles of mutants in each of the
components (dosR and dosS) showed that DosR is required for the expression of genes usually induced
under oxygen limitation, such as hspX gene. Several putative operons with unknown function were also
strongly regulated by DosRS.
Up to 30 genes were found to be up-regulated, and another 68 genes down-regulated in a mutant of M.
tuberculosis senX3-regX3. However, it has not been clearly determined if the changes found in gene
expression were directly or indirectly related to the lack of this two-component regulatory system
(Parish 2003a). Recently, the global transcriptional profile of the two-component systems PhoP and
MprA has been reported. One of these studies provided evidence that the PhoP/PhoR system is a
positive transcriptional regulator of genes involved in the synthesis of the cell envelope of M.
tuberculosis (Walters 2006). On the other hand, MprA regulates sigB and sigE and many other genes
previously reported to be associated to various stress conditions (He 2006).
In order to analyze the mechanisms involved in bacilli intracellular survival, mycobacterial gene
expression was determined in M. tuberculosis infected macrophages from different sources.
Macrophages play a crucial role in TB infection because they represent both the effector cells for
bacterial killing and the primary habitat in which the persisting bacilli reside. Macrophages have
been investigated at different time points post-infection for the differential expression of various
two-component system regulators (regX3, phoP, prrA, mprA kdpE, tcr, devR and tcrX) (Haydel 2004).
More recently, the gene expression profile of M. tuberculosis grown in human macrophages compared to
that of bacteria growing in synthetic culture medium was published (Capelli 2006). In this work, the
authors reported that approximately one-third (32 %) of the genes upregulated by M. tuberculosis in
macrophages correspond to conserved hypothetical proteins, with unknown function; this finding
highlights the considerable gap that still remains in the knowledge of how this bacterium survives
intracellularly. Genes involved in cell wall processes (19.5 %), regulation and information pathways
(16 %), and PE family proteins (3.6 %) were also upregulated. Interestingly, the authors observed
high induction of the sigma factor sigG and 13 other putative transcriptional regulators.
Upregulation of sigA, sigE, and sigG was also reported in a similar study (Volpe 2006). The whiB3
gene was also induced in M. tuberculosis during infection of naïve bone marrow-derived macrophages
in comparison to bacteria in broth culture mid-log growth (Banaiee 2006).
4.3.3. In vivo gene expression
The use of microarrays for profiling transcriptomes of bacteria inside the host cell has been
limited by the paucity of bacterial mRNA in samples containing a preponderance of mammalian RNA.
Therefore, while significant work has been performed on the gene expression profile of the host,
information on M. tuberculosis expression inside infected hosts is still limited.
So far, there is only one publication concerning global mycobacterial transcription expression in
the animal model, using microarray as the analytical method (Talaat 2004). Differential expression
levels of M. tuberculosis during infection in Balb/c or Severe Combined Immunodeficiency (SCID) mice
were evaluated and compared to the levels found in mycobacteria grown in broth culture. These
authors identified up to 40 genes whose expression significantly changed during Balb/c and SCID
mouse infection. These genes include rubB, dinF, and fdxA. The same genes were also found to be
induced 24 hours post-infection in murine bone marrow macrophages (Schnappinger 2003). Additionally,
several genes were regulated up or down only in Balb/c mice, such as proZ (transport system permease
protein), aceAa (probable isocitrate lyase involved in lipid metabolism), and genes encoding for
regulatory proteins, such as sigK, sigE and kdpE. The authors concluded that the expression profile
of M. tuberculosis in SCID mice resembles the profile found in bacilli grown in vitro, while the
expression profile in Balb/c mice resembles that reported in multiplication within the macrophages
(Schnappinger 2003). Exceptionally, some genes were found to be expressed only in Balb/c mice.
A small number of studies applied quantitative RT-PCR to investigate the expression of mycobacterial
genes in the animal model. These studies focused on the analysis of a few particular genes.
Examination of lungs of infected C57BL/6 mice showed that the transcriptional regulator genes whiB3,
fdxA (electron transfer), hspX (a-crystalline), acg (unknown function), Rv1738 (unknown function),
and Rv2626c (unknown function) were markedly induced during the course of infection (Banaiee 2006).
A gene required for extrapulmonary dissemination (hbhA) was also upregulated in the lung but not in
the spleen during the early stages of infection (Delogu 2006). While the expression of PE_PGRS16
was up-regulated in the spleens and lungs of infected mice, the expression of PE_PGRS26 was
down-regulated (Dheenadhayalan 2006). A study on human lung biopsies revealed a high variability in
expression profiles of specific M. tuberculosis genes among the specimens analyzed (Timm 2003). The
biopsies were obtained from four HIV-negative patients with chronic active TB that was unresponsive
to therapy. Although some differences were observed when comparing human and murine lung, the
authors admitted that it was difficult to ascertain whether the infection stage in the analyzed
human lung specimens could be correlated with the persistent infection in mice.
4.4. M. tuberculosis proteome
With the availability of the genomes of M. tuberculosis H37Rv, M. tuberculosis CDC1551, M. bovis,
and the ongoing sequencing projects, attention in the coming years must be focused on the
interpretation of the sequences determining the structure and function of the proteins. Proteomics,
the global study of proteins that are translated in a given physiological state is one of the most
important and ambitious goals in M. tuberculosis research. The proteome of an organism implies not
only an inventory of its gene products but also the transduction rate and the post-transcriptional
events that occur in the organism (Betts 2002). Classical studies of proteomics involve two
dimensional electrophoresis (2-DE), in which proteins are first separated by the isoelectric point
and then by the molecular weight (O'Farrel 1975). Every spot of protein is then isolated, hydrolyzed
and subjected to techniques of mass spectrometry (MS), tandem MS (MS/MS); matrix-assisted laser
desorption/ionization mass spectrometry (MALDI/MS) and, matrix-assisted laser desorption/ionization
time of flight mass spectrometry (MALDI-TOF/MS). For a good review on the different techniques used
in protein mapping, readers are referred to Patterson et al (2000). Techniques different from two
dimensional electrophoresis have also been implemented. For instance, the use of one dimension
electrophoresis has been shown to be very useful for the separation of hydrophobic proteins (Simpson
2000). Other approaches that do not involve the use of gels, such as two-dimensional liquid
chromatography (LC) and the subsequent analysis by MS (2 DLC/MS), have been shown to be very
efficient in the identification of hydrophobic and membrane proteins (Isobe 1991). In 1999, the
isotope-coded affinity tag (ICAT) technology was reported (Gygi 1999). In this, mixtures of proteins
from bacteria in two different conditions are covalently labeled with isotopically labeled heavy or
light forms of the reagents. The samples are combined and subjected to proteolysis. After
purification of the labeled peptides through affinity tag, which is part of the reagents, they are
analyzed by LC-MS/MS. This new technology has proven to be very useful in the quantitation of
complex mixtures of proteins.
Before the disclosure of the M. tuberculosis genome, antigens and proteins were identified by one-
and two-dimension polyacrilamyde gel electrophoresis, and the use of cumbersome immunological
methods (Nagai 1991, Garbe 1996). With the advance in high-resolution 2-DE and analytical chemistry,
M. tuberculosis proteome is at present a reality. Pioneering studies in the proteomic field included
the mapping of 32 N-terminal sequences by MS and the identification of culture filtrate proteins by
2-DE and immunodetection (Sonnenberg 1997).
Very soon after the publication of its genome, bioinformatic tools were applied to predict the
proteomic profile of M. tuberculosis (Tekaia 1999). The in silico analysis showed characteristics of
the tubercle bacillus as the duplication of numerous genes, especially those involved in gene
regulation and in lipid metabolism, and those coding for the PE/PPE protein family. The study also
showed a reduced repertoire of proteins devoted to transport, which might reflect the intracellular
lifestyle. The bioinformatics-predicted proteome was compared to 2-DE protein maps obtained from M.
tuberculosis whole cell lysates, which were separated in a broad pH range between 2.3 and 11.0. This
work demonstrated that proteins with a molecular mass below 10 kDa were not predicted from the
genome sequence and also that experimentally basic and high molecular mass proteins could not be
resolved by 2-DE (Urquhart 1998). Since 1999, the huge amount of data in proteomics has led to the
creation of 2-DE databases, where images generated in different laboratories can be stored and
analyzed. These databases are accessible on the internet at:
http://web.mpiib-berlin.mpg.de/cgi-bin/pdbs/2d-page/extern/overview.cgi?gel=16 (Mollenkopf 1999) and
http://www.ssi.dk/sw14644.asp (Rosenkrands 2000b).
4.4.1. Structural proteomics of M. tuberculosis
Thanks to recent technological advances, the subcellular protein profile of M. tuberculosis can now
be drawn. The global analysis of compartmentalized proteins will shed light on host-pathogen
interactions, metabolic pathways and cell communication, just to mention some of the mechanisms
related to pathogenesis. In addition, many pathogenic bacteria secrete proteins that are involved in
virulence (Finlay 1997) and thus culture filtrates of M. tuberculosis could be a source for
identification of virulence factors. Cell wall proteins play a fundamental role in cell
architecture, resistance of the pathogen to chemical injury and dehydratation, and many other key
functions of this microorganism. Thus, the identification of proteins localized in this subcellular
fraction may lead, in the near future, to the development of new diagnostic tests and drugs.
Membrane proteins demand special attention, because they are involved in host-pathogen interactions,
nutrient transport, quorum sensing mechanisms, etc. Knowledge of these proteins could be the clue to
the development of novel vaccines. Finally, the identification of cytosol proteins and the intricate
network of their interaction will reveal metabolic pathways that can be targets for the design of
rational drugs against TB. Even though we are still far from identifying the almost 4,000 genes
predicted by genomics, the number of identified proteins increases each year and shows how genomic
and proteomic technologies complement each other.
The biochemical methods developed for the separation of the cell wall, membrane and cytosol
fractions have facilitated proteomic studies in M. tuberculosis (Hirschfield 1990, Lee 1992).
Jungblut et al. identified 53 proteins from cell lysates and 54 from culture filtrates using 2-DE
and MALDI/MS (Jungblut 1999). These authors performed comparative proteomics in M. tuberculosis,
which will be discussed later in this chapter.
Reference maps of cellular fractions and culture filtrate proteins were constructed using 2-DE,
N-terminal sequencing and antibodies against previously identified antigens (Rosenkrands 2000b). As
many as 1,184 spot proteins were visualized after silver staining. Only 10 % of them were
identified. In order to map less abundant proteins, different methods were applied for their
separation, which allowed the identification of 12 novel proteins, five of them with a known
function (Rosenkrands 2000b). The study also showed the identification of a protein that was not
predicted by genomics and revealed the presence of alternative start codons.
The implementation of immobilized pH gradient for 2-DE and MALDI/MS allowed the identification of
288 proteins (Rosenkrands 2000a). Six proteins were identified, all of them with molecular masses
between 13,200 and 7,200 kDa and with isoelectric point (pI) ranges between 4.5 and 5.9. Five of
these proteins were correctly identified in the genome of the clinical strain CDC1551 (Jungblut
2001).
In spite of the enormous usefulness of 2-DE in proteomic studies, there are certain disadvantages
inherent to its technique, such as the low resolution of proteins with very high or very low
molecular masses, or proteins that are very acidic, very basic or hydrophobic. But in particular,
the technique is biased towards the preferential identification of the most abundant proteins.
Therefore, less abundant proteins, such as transcriptional regulators, are rarely detected when
whole cell lysates are analyzed (Gygi 1999). To overcome these inconveniences, alternative
techniques have been applied in proteomic studies. For example ICAT reagent method and LC-MS/MS were
used as a complement of 2-DE-MS/MS. Using these approaches, 388 M. tuberculosis proteins were
quantified and identified. Each one of these techniques has been shown to be adequate for the
identification of certain classes of proteins. For example, the ICAT method performed better for the
identification of cell membrane and high molecular mass proteins, while 2-DE showed better results
in the identification of low molecular mass and cysteine-free proteins (Schmidt 2004).
Interestingly, none of these techniques allowed the identification of proteins in the following
subclasses: cell division, IS elements, repeated sequences, phages, PE/PPE families, cytochrome P450
enzymes, cyclases, and chelatases.
By 2004, only about 400 proteins had been identified in the proteome of M. tuberculosis, probably
due to the limitations of 2-DE-based separation methods. The use of automated two-dimensional,
capillary high-performance liquid chromatography (HPLC) coupled with MS gave a wider and more
accurate proteomic profile of M. tuberculosis (Mawuenyega 2005). Proteins of the cell wall, membrane
and cytosol subcellular fractions could be identified. The number of identified proteins increased
to 1,044 non-redundant proteins, 67 % more than those obtained by conventional 2-DE. This study
identified proteins in extreme pI ranges, among them the most acidic proteins ( PE_PGRS, Rv3512)
with a pI of 3.89 and the most basic proteins (rps2, a 30S ribosomal protein) with a pI of 12.18.
Proteins of high molecular mass, such as the 230,621 Da polyketide synthase ppsC, were also
identified. A total of 705 proteins were identified in the membrane, 306 were localized in the cell
wall, and 356 in the cytosol fraction. Forty-seven were present in all analyzed fractions. The study
also included a computational analysis of protein networks, one of the most exciting fields in the
coming years. Readers are invited to consult the supplementary table of this work (Mawuenyega 2005).
M. tuberculosis is an intracellular pathogen, the bacillus is engulfed by alveolar macrophages where
it can survive and grow by altering the intracellular compartments to preclude the normal maturation
to phagolysosomes or to prevent fusion of phagosomes to lysosomes (Clark-Curtiss 2003). The
interaction between host and pathogen is thought to be mediated by membrane proteins. Therefore, the
characterization of membrane proteins is a topic of intensive research. As mentioned before, most of
the studies regarding M. tuberculosis proteomics have been carried out by 2-DE. However, the number
of membrane and membrane-associated proteins has been underestimated by the 2-DE technology due to
the hydrophobic nature of this class of proteins and their low solubility. In order to overcome
these problems, fractions of cellular membranes were prepared by differential centrifugation and
separated by one-dimensional electrophoresis. The separated bands were then excised and hydrolyzed
prior to LC and MS/MS (Gu 2003). This approach allowed the identification of up to 739 membrane and
membrane-associated proteins. Very hydrophobic proteins, including those with 15 transmembrane
helices, were detected in this study. The use of alternative solubilizing agents, such as Triton
X-114, has proven to be a good choice for membrane fractionation. The detergent was shown to be
useful in the identification of nine novel proteins that have been already incorporated in the M.
tuberculosis proteome (Sinha 2005). Interestingly, when analyzing the interferon-gamma (IFN-g)
response of BCG-vaccinated healthy individuals from an endemic area to these newly identified
proteins, the strongest response was found to be that against ribosomal proteins. Other membrane
associated proteins, such as ESAT-6, did not contribute significantly to the T-cell response in
these individuals.
4.4.2. Comparative proteomics
The comparative proteomic analysis using 2-DE and MALDI/MS was applied to compare proteins present
in two virulent laboratory M. tuberculosis strains (H37Rv and Erdman strains) with those present in
two M. bovis BCG strains (Chicago and Copenhagen BCG strains) (Jungblut 1999). The results showed
that, as expected, the two M. tuberculosis strains differed from each other in only a few proteins.
Of the 18 variant proteins, 16 were identified. L-alanine dehydrogenase (Rv2780) was not detected in
the Erdman strain, and the protease IV was absent in this strain. On the other hand, the Soj protein
and the hypothetical protein Rv2641 were absent from the M. tuberculosis H37Rv proteome. Some of the
18 proteins were over-expressed in one or the other strain, and some shifted their mobility probably
due to the presence of amino acid substitutions. The comparison of M. tuberculosis H37Rv with M.
bovis BCG revealed the presence of 13 protein spots exclusive to the tubercle bacilli, six of which
were identified. The differential proteins comprised L-alanine dehydrogenase (40 kDa protein),
isopropyl malate synthase nicotinate-nucleotide pyrophosphatase (Rv1596), MPT64 (Rv1980c), and two
hypothetical conserved proteins (Rv2449c and Rv0036c). On the other hand, M. tuberculosis H37Rv
lacked eight spots compared to M. bovis BCG.
In another study using 2-DE and MS, a comparison of the proteins present in M. tuberculosis and M.
bovis BCG revealed the presence of 56 unique protein spots in M. tuberculosis and 40 in the
attenuated strain BCG (Mattow, 2001). Of these, 32 were identified as exclusive proteins of M.
tuberculosis, of which 12 had been previously reported to be deleted in M. bovis BCG. The remaining
20 spots were newly identified as absent from M. bovis BCG.
A third comparative proteomic study of M. tuberculosis and M. bovis BCG was performed using 2-DE and
ICAT technology (Schmidt 2004). This work demonstrated the presence of only three exclusive proteins
in M. tuberculosis H37Rv. One is Rv0223c, a protein belonging to the aldehyde dehydrogenase family.
The second is Rv0570, a ribonucleotide reductase class II. The third is a hypothetical protein named
Rv1513.
The studies on comparative proteomics allowed the identification of isopropyl malate synthase
exclusively in the M. tuberculosis proteome. Recently, this enzyme was included in a new class of
virulence factors known as 'anchorless adhesins' (Kinhicard 2006) that were absent from the
avirulent BCG, thus proving the usefulness of this methodology.
Of special interest in the coming years will be the proteomic comparison between circulating M.
tuberculosis strains differing in virulence, transmissibility, tissue tropism, and/or ability to
acquire drug resistance. As a matter of fact, the proteomic profile of M. tuberculosis H37Rv has
already been compared with that of the clinical strain CDC1551 at different time points during in
vitro growth (Betts 2000). Subscribing the low structural DNA polymorphism observed in M.
tuberculosis (Sreevatsan 1997), the resulting patterns of the protein-spot were found to be both
highly reproducible and highly similar between the two strains during growth. One unique protein was
identified in M. tuberculosis CDC1551, namely Rv0927c, a probable alcohol dehydrogenase. Similarly,
a spot corresponding to the HisA protein, which is involved in the histidine biosynthetic pathway,
was detected in the M. tuberculosis H37Rv proteome but was absent from the M. tuberculosis CDC1551
protein profile. Oddly enough, both genes were found to be present in both M. tuberculosis strains.
Thus, the described proteomic differences between H37Rv and CD1551 might be ascribed to
post-translational events or to degradation during the manipulation of the specimens. Another
interesting feature in the same study was the mobility variations of the transcriptional regulator
MoxR, which the authors attributed either to amino acid changes or to post-translational
modifications. A BlastP analysis of both genomes showed a single amino acid substitution of
histidine in M. tuberculosis H37Rv to asparagine in M. tuberculosis CDC1551 that might explain the
variation in mobility.
Transcriptional regulation differences between strains might be the key to understanding how
virulence factors are involved in a variety of roles, including host-cell invasion, survival within
the host cell, and long-term persistence. Therefore, comparative proteomic studies are of special
interest in the post genomic era, helping to understand the manifestation of disease produced by
different strains involved in the current TB epidemic.
4.4.3. Environmental proteomics
The information obtained by genomic studies is static, because DNA is not essentially affected by
the environment. In contrast, the proteomic profile of an organism in a particular physiological
situation complements and helps to decipher its interaction with the environment. The study of the
M. tuberculosis proteome in different physiological states is one of the most fascinating fields of
research. Being an intracellular pathogen, the bacillus is challenged by a variety of environmental
changes. Inside the mammalian macrophage, the microorganism is subjected to a series of different
weapons. Inside the granuloma, it faces low oxygen tension, starvation, low pH, reactive nitrogen,
and reactive oxygen species, among other offenses (Schnappinger, 2006). The bacillus has developed
adaptive mechanisms for survival and persistence in these hostile environments. The identification
of proteins expressed under such conditions is a matter of demanding research.
M. tuberculosis has another outstanding characteristic: it is able to persist for years in its host
causing latent infection, in a state known as dormancy. The mechanisms governing this state are
still not fully understood and the protein expression profile in models mimicking the dormant state
is an issue of intense research. Different in vitro models have been developed, aimed at simulating
the in vivo conditions inducing dormancy (Wayne 1996, Betts 2002, Voskuil 2003). In the Wayne's
model, which has been applied in proteomic studies, the level of oxygen is gradually depleted due to
bacterial growth, defining two non-replicating stages: a microaerophilic stage NRP-1
(non-replicating persistance-1), followed by an anaerobic stage NRP-2. Still, the evidence linking
human M. tuberculosis latent infection with M. tuberculosis dormant stages attained in vitro remains
merely circumstantial.
Hypoxia is among the most conspicuous conditions encountered by the tubercle bacilli in the central
part of the granuloma, where bacilli are considered to remain dormant. The hypoxic response of M.
tuberculosis in the Wayne's model was investigated by performing a proteomic study of cell lysates
and culture filtrates (Rosendkrands, 2002). The comparison of the protein content between aerobic
and anaerobic cultures identified up to seven proteins that were more abundant in hypoxic
conditions. The main proteins characterized were fructose biphosphate aldolase (in culture filtrate
only), hypothetical protein Rv0569, and alpha-crystallin protein, also known as HspX. Other proteins
identified included hypothetical proteins Rv2623 and Rv2626c, L-alanine dehydrogenase (only in
culture filtrates), and BfrB, a bacterioferritin involved in iron uptake and storage.
Using a modified Wayne's model, Stark et al. (Stark 2004) visualized 13 unique and 37 more abundant
spots under hypoxia, revealed by 2-DE and MALDI-TOF/MS. Of these 50 spots, 16 proteins were
identified, including some that had not been previously detected such as GroEL2, KasB, Ef-Tu, ScoB,
TrxB2, and CmaA2. Among the hypothetical proteins found were Rv2005c, with similarity to universal
stress proteins, Rv0560c, Rv2185c, and Rv3866.
Applying the ICAT technology to the comparison of the physiological NRP-1/NRP-2 states versus active
growth (logarithmic phase), the number of newly identified proteins increased up to 875 (Cho 2006).
A total of 586 proteins were identified and quantified in the microaerobic stage of nonreplicating
persistence stage (NRP-1), 628 others were detected in the anaerobic dormancy state (NRP-2), and 339
were common to both non-replicating persistence stages. Proteomic comparison between the NPR-1 and
the logarithmic phase of growth showed that 6.5 % of the proteins were up-regulated in NRP-1, while
20.4 % of the proteins were up-regulated in NRP-2. The analysis of the proteomic profile showed that
the NRP-1 state displayed a significant increase in proteins involved in small molecule degradation
and the NRP-2 state a significant increase in energy metabolism, altogether suggesting an adaptive
mechanism of M. tuberculosis to enter into the anaerobic environment.
M. tuberculosis pathogenicity is directly associated with its ability to establish invasion and
division inside the host's macrophages, despite the antimicrobial properties of these cells. The
study of the M. tuberculosis proteome in this physiological environment is a crucial step towards
understanding the mechanisms involved in its pathogenicity. In spite of the enormous advances in
biochemical analytical techniques, the purification and identification of proteins is not always an
easy task. In a recent paper, Mattow et al. (Mattow 2006) applied subcellular fractionation of
infected murine bone marrow-derived macrophages in combination with high-resolution 2-DE and MS/MS
to analyze the proteome of M. tuberculosis inside the phagosome. The proteome was compared with that
derived from broth cultures. Using this approach, 121 unique spots were detected in intra-phagosomal
mycobacteria. Only 11 of them were identified as M. tuberculosis proteins, the remaining 110 were
identified as murine proteins. The 11 identified proteins were: Rv1240 (Malate dehydrogenase, MdH),
Rv1077 (Cystathione (beta)-synthase, CysM2), Rv3396c (GMP synthase, GuaA), Rv0489 (Phosphoglycerate
mutase I, Gpm), Rv2773c (Dihydrodipicolinate reductase, DapB), Rv0009 (Peptidyl-prolyl cis-trans
isomerase, PpiA), Rv1627c (lipid carrier protein), Rv2961 (putative potassium uptake protein, TrkA),
and hypothetical protein Rv1130, which was detected in two separates spots, Rv0428c, and Rv1191.
Some of these proteins (CysM2, DapB, GuaA, MdH and PpiA) had previously been detected using 2-DE
patterns of whole cell lysates and/or culture supernatants of M. tuberculosis H37Rv, indicating that
they are not exclusive of the phagosomal millieu.
Proteomic studies seem to be a successful way to discover new virulence factors, drug target
molecules and proteins involved in pathogenic mechanisms. Thousands of proteins have now been
identified and many more await identification.
4.5. An insight into M. tuberculosis metabolomics
4.5.1. Metabolomics state-of-the-art
The term metabolomics was first coined in 1998 (Oliver 1998) to describe the "change in the relative
concentrations of metabolites as the result of deletion or over-expression of a gene". At the same
time, the term metabolome analysis referred to the analysis of metabolites in the phenotypic profile
of E. coli (Tweeddale 1998). Later on, metabolomics was considered the detection and measurement,
under defined conditions, of cellular metabolites such as low molecular weight molecules present in
an organism or biological sample. The field also includes information on the level of metabolite
activities in the cell. Metabolites are in general defined as those small molecules, usually
intermediate and final products of metabolism, but the definition also applies to high molecular
weight molecules such as lipids, peptides and carbohydrates (sometimes referred to as "lipidomics",
etc).
Metabolomic approaches are now feasible due to the rapid improvements that have taken place during
the last decade in two areas: analytical chemistry and bioinformatics. Metabolomic methodologies
include the combination of classical technologies, such as gas chromatography-mass spectrometry
(GC-MS), or nuclear magnetic resonance (NMR), with new developments to achieve improved sensitivity
and discriminative power. Sophisticated informatic analysis and data mining are an important part of
the methodology. The complete analysis involves in silico models on metabolite-protein interactions.
This analysis can be qualitative or quantitative. In the latter case, all the conditions required
for an accurate quantification should be considered, such as the use of appropriate data standards,
etc (Nielsen 2005). Metabolomics can also help to validate in silico pathways prepared on the basis
of available genome sequences and established databases (Park 2005).
Metabolomic analysis has mainly been used in studies on plants and human pathology; in this latter
case, the attention was focused on searching for metabolites associated with disease, in other
words, "metabolites as biomarkers of disease" (Weckwerth 2005). Microbial metabolomics has initially
been devoted to explore bacterial or fungal strains carrying improved phenotypes with a certain
biotechnology usefulness value (Wang 2006). Metabolomic approaches have also been directed to the
development of new drugs addressed against novel microbial targets. A review on the basics and
applications of microbial metabolomics can be read in van der Werf et al. (Werf 2005).
4.5.2. Has the metabolomic analysis of tuberculosis actually started?
From the beginning, a unique property of mycobacterial cells called the attention of scientists: the
remarkably high lipid content of the cell envelope, which accounts for the most conspicuous
mycobacterial features, including physiology and pathogenicity (Asselineau 1998, Barry 2001) (see
Chapter 3). Many studies have been published on identification, characterization, and even practical
applications (e.g. diagnostics) of several mycobacterial lipids. Almost all books on TB or
mycobacteria have at least one chapter dedicated to lipids. Older books refer in more depth to the
structural and chemical characterization of the envelope, as well as to the biosynthesis of lipids
(Ratledge 1982, Kubica 1984); more recent books lay stress on the genetics and genes related to
lipid metabolism (Cole 2005). Thus, the analysis of the lipid metabolic profiling cannot be regarded
as a new field in mycobacteria, at least when considering all lipids as metabolites (Ortalo-Magne
1996). The main objection to doing so is the size of mycobacterial lipids. Indeed, mycobacterial
lipids are rather big and complex. Most of them have a fatty acid backbone covalently linked to
other kind of molecules, most frequently several types of saccharides (Asselineau 1998). Many lipids
also belong to the molecular structure of bacterial lipoproteins (Sutclife 2004).
A detailed revision of mycobacterial lipids has been published relatively recently (Kremer 2005).
The most representative lipids in mycobacteria are the mycolic acids. These molecules are larger in
mycobacteria, compared to those of other related bacteria, such as Corynebacterium or Nocardia.
Mycolic acids are the lipid component in the structure of complex glycolipids, including
Mycolyl-ArabinoGalactan (mAG) and Trehalose-6-6'-Dimycolate (TDM). Another important group of lipids
also contains trehalose as the glycosyl radical molecules and their fatty acids chains are
multi-methylated. This group includes Di- Tri- and Pentaacyl Trehalose (DAT, TAT and PAT) and
Sulfolipids (SL); the Phthiocerol Dimycocerosates (PDIMs) are also very important. These compounds
and the closely related Phenolic Glycolipids (PGL) participate in the integrity of the cell envelope
of M. tuberculosis. A last group of glycolipids contain D-mannan and D-arabinan in their molecules
and have been considered of relevance in bacterial pathogenicity for a long time: Lipomannans (LM)
and Lipoarabinomannans (LAM) (see Chapter 3).
Many lipids are unique to mycobacteria and therefore their metabolic analysis cannot be addressed by
comparative lipidomic studies with other bacteria. Such specific metabolic pathways are viewed, in
turn, as excellent targets for the design of new specific drugs (Draper 2000). Renewed efforts have
been applied to the detection of metabolic routes and genes that participate in the biosynthesis and
degradation of complex lipids (Brennan 2003, Reed 2004, Portevin 2004, Veyron-Churlet 2004, Trivedi
2005, Kaur 2006), as well as genes involved in lipid transportation (Domenech 2004, Jain 2005). A
review has recently been published on the biosynthesis, regulation and transport of long-chain
multiple methyl-branched fatty acids, such as PDIM, PGL, and SL (Jackson 2006). This paper updates
the knowledge on these complex topics, indicating that mycobacterial lipids share mechanisms in
their metabolic routes, and that changes in a pathway could influence another pathway; in fact, some
small molecules, namely metabolites, could be precursors of the more complex synthesis of lipids,
and also be synthesized themselves during the lipids' metabolic pathway, thus being by-products or
secondary products of the lipid's metabolism.
Few studies deal with the small metabolites from mycobacterial. A recent study on Rv2221, coding for
a highly efficient M. tuberculosis adenyl-cyclase, indicates that its catalytic activity is
regulated by fatty acids. Thus, M. tuberculosis lipids seem to be involved in signal transduction
through the main metabolite cAMP (Abdel-Motaal 2006). In fact, small metabolites are often involved
in signaling transmission in many bacteria. The relevant PhoP/PhoR two-component system was
demonstrated to be related to lipid metabolism in M. tuberculosis (Gonzalo Asensio 2006). Although
the specific signal sensed by PhoR is still unknown (Jackson 2006), some small molecules
(metabolites) might behave as its signaling effectors. In fact, the homologous two-component system
(PhoP/PhoQ) is sensored by magnesium in other bacteria (Martin-Orozco 2006).
The association of mycobacterial lipids to M. tuberculosis pathogenicity is a matter of renewed
interest (Riley 2006). A role in the establishment and progress of the pathology caused by the
tubercle bacilli has been classically assigned for years to many of those lipids (Bloom 1994).
However, most of the studies were conducted using lipids as isolated molecules, overlooking the
interactions with other molecules within the bacterial cell and the environment. In fact, the lipid
contents of the bacillus change according to the environmental conditions. Garton et al. described
an increase of lipophilic inclusions according to the lipid content in the in vitro culture medium
where bacteria were grown (Garton 2002). Lipid availability is probably not low inside man, the
natural host, and in vivo bacilli could be lipolitic rather than lipogenic (Wheeler 1994).
Trafficking of mycobacterial lipids from bacterial vacuoles to the endosomes of macrophages was
demonstrated in M. tuberculosis infected macrophages, and mycobacterial lipids were detected even in
uninfected cells. These findings indicate that through its own lipids, M. tuberculosis exerts a wide
influence on its environment that extends beyond truly infected cells (Beatty 2000). Altogether,
these data underline the great importance of the metabolomic analysis for the interpretation of the
biology of the tubercle bacillus and its relation with the host.
4.6. Concluding remarks
The availability of the first M. tuberculosis genome sequences triggered an overwhelming amount of
knowledge on the genetics and the biology of M. tuberculosis. The way was opened for comparative and
functional genomics. Scientists and medical doctors started to appreciate the potential coding
capacity of this extraordinary organism. A broad picture of the M. tuberculosis gene content and
coding capacity has been revealed. Sequencing and comparison with other genomes have shown the close
relations that exist among the members of the M. tuberculosis complex, and have allowed the
identification of a core mycobacterial genome, a minimal set of genes conserved in the different
mycobacterial species. These advances were generated in a little more than seven years by no more
than five publicly available genomes. Now, with the advent of new technologies and 21 genome
projects in process, the study of mycobacteria and comparative genomics seems not only promising but
very exciting. The application of new technologies, such as DNA microarray for the comparison of M.
tuberculosis wild isolates, is a promising approach towards understanding its natural biology and
adaptative evolution in the human population.
As research on the biology of TB expands, new and more accurate information is generated. In the
coming years, knowledge about the real coding capacity of the tubercle bacillus will increase
exponentially, and genome sequences will feed back from transcriptome and proteome analysis, filling
old gaps and opening new ones in the understanding of M. tuberculosis biology. Functional genomics
has become a key tool in the understanding of the biology of M. tuberculosis. By providing
information about the pathogenesis of the disease, it is expected to promote the discovery of
vaccine candidates and the investigation of novel drug targets. Investigations on complex biological
systems can be now envisaged under a metabolomic perspective (Forst 2006). Metabolomics is a newborn
methodology in microbiology and is even younger in mycobacteriology, therefore, almost everything
remains to be learned in TB concerning that discipline.
It is clear that a long way still remains to be walked to understand how the tubercle bacillus
behaves inside the host, its unique known environment. A more comprehensive integration of the
knowledge generated by genomics, transcriptomics, proteomics and various molecular tools will surely
provide a clearer picture of the amazing pathogen M. tuberculosis and the illness that it causes.
References
1. Abdel Motaal A, Tews I, Schultz JE, Linder JU. Fatty acid regulation of adenylyl cyclase Rv2212
from Mycobacterium tuberculosis H37Rv. FEBS J 2006; 273: 4219-28.
2. Agarwal N, Raghunand TR, Bishai WR. Regulation of the expression of whiB1 in Mycobacterium
tuberculosis: role of cAMP receptor protein. Microbiology 2006; 152: 2749-56.
3. Asselineau J, Laneelle G. Mycobacterial lipids: a historical perspective. Front Biosci 1998; 3:
164-74.
4. Bagchi G, Chauhan S, Sharma D, Tyagi JS. Transcription and autoregulation of the
Rv3134c-devR-devS operon of Mycobacterium tuberculosis. Microbiology 2005; 151: 4045-53.
5. Banaiee N, Jacobs WR Jr, Ernst JD. Regulation of Mycobacterium tuberculosis whiB3 in the mouse
lung and macrophages. Infect Immun 2006; 74: 6449-57.
6. Barnard A, Wolfe A, Busby S. Regulation at complex bacterial promoters: how bacteria use
different promoter organizations to produce different regulatory outcomes. Curr Opin Microbiol 2004;
7: 102-8.
7. Barry CE 3rd. Interpreting cell wall ´virulence factors´ of Mycobacterium tuberculosis. Trends
Microbiol 2001; 9: 237-41.
8. Bashyam MD, Hasnain SE. The extracytoplasmic function sigma factors: role in bacterial
pathogenesis. Infect Genet Evol 2004; 4: 301-8.
9. Beaucher J, Rodrigue S, Jacques P-E, Smith I, Brzezinski R, Gaudreau L. Novel Mycobacterium
tuberculosis anti-s factor antagonists control sF activity by distinct mechanisms. Mol Microbiol
2002; 45: 1527-40.
10. Beatty WL, Rhoades ER, Ullrich HJ, Chatterjee D, Heuser JE, Russell DG. Trafficking and release
of mycobacterial lipids from infected macrophages. Traffic 2000; 1: 235-47.
11. Behr MA, Wilson MA, Gill WP, et al. Comparative genomics of BCG vaccines by whole-genome DNA
microarray. Science 1999; 284: 1520-3.
12. Betts JC, Dodson P, Quan S, et al. Comparison of the proteome of Mycobacterium tuberculosis
strain H37Rv with clinical isolate CDC 1551. Microbiology 2000; 146: 3205-16.
13. Betts JC. Transcriptomics and proteomics: tools for the identification of novel drug targets and
vaccine candidates for tuberculosis. IUBMB Life 2002; 53: 239-42.
14. Bloom BR (Ed.) Tuberculosis. Pathogenesis, protection and control. ASM Press. Washington DC.
1994.
15. Brennan PJ. Structure, function, and biogenesis of the cell wall of Mycobacterium tuberculosis.
Tuberculosis (Edinb) 2003; 83: 91-7.
16. Brodin P, Eiglmeier K, Marmiesse M, et al. Bacterial artificial chromosome-based comparative
genomic analysis identifies Mycobacterium microti as a natural ESAT-6 deletion mutant. Infect Immun
2002; 70: 5568-78.
17. Brosch R, Philipp WJ, Stavropoulos E, Colston MJ, Cole ST, Gordon SV. Genomic analysis reveals
variation between Mycobacterium tuberculosis H37Rv and the attenuated M. tuberculosis H37Ra strain.
Infect Immun 1999; 67: 5768-74.
18. Brosch R, Gordon SV, Pym A, Eiglmeier K, Garnier T, Cole ST. Comparative genomics of the
mycobacteria. Int J Med Microbiol 2000a; 290: 143-52.
19. Brosch R, Gordon SV, Buchrieser C, Pym AS, Garnier T, Cole ST. Comparative genomics uncovers
large tandem chromosomal duplications in Mycobacterium bovis BCG Pasteur. Yeast 2000b; 17: 111-23.
20. Brosch R, Gordon SV, Eiglmeier K, Garnier T, Cole ST. Comparative genomics of the leprosy and
tubercle bacilli. Res Microbiol 2000c; 151: 135-42.
21. Brosch R, Pym AS, Gordon SV, Cole ST. The evolution of mycobacterial pathogenicity: clues from
comparative genomics. Trends Microbiol 2001; 9: 452-8.
22. Brosch R, Gordon SV, Marmiesse M, et al. A new evolutionary scenario for the Mycobacterium
tuberculosis complex. Proc Natl Acad Sci U S A 2002; 99: 3684-9.
23. Camus JC, Pryor MJ, Medigue C, Cole ST. Re-annotation of the genome sequence of Mycobacterium
tuberculosis H37Rv. Microbiology 2002; 148: 2967-73.
24. Canneva F, Branzoni M, Riccardi G, Provvesi R, Milano A. Rv2358 and FurB: two transcriptional
regulators from Mycobacterium tuberculosis which respond to zinc. J Bacteriol 2005; 187: 5837-40.
25. Capelli G, Volpe E, Grassi M, Liseo B, Colizzi V, Mariani F. Profiling of Mycobacterium
tuberculosis gene expression during human macrophage infection: upregulation of the alternative
sigma factor G, a group of transcriptional regulators, and proteins with unknown function. Res
Microbiol 2006; 157: 445-55.
26. Chandler M, Mahillon J. Insertion sequences revisited. In: Mobile DNA II Craig NL, Craigie R,
Gellert M, Lambowitz AM (Eds.) ASM Press. 2002.
27. Cho SH, Goodlett D, Franzblau S. ICAT-based comparative proteomic analysis of non-replicating
persistent Mycobacterium tuberculosis. Tuberculosis (Edinb) 2006; 86: 445-60.
28. Clark-Curtiss JE, Haydel SE. Molecular genetics of Mycobacterium tuberculosis pathogenesis. Annu
Rev Microbiol 2003; 57: 517-49.
29. Cole ST, Saint Girons I. Bacterial genomics. FEMS Microbiol Rev 1994; 14: 139-60.
30. Cole ST, Brosch R, Parkhill J, et al. Deciphering the biology of Mycobacterium tuberculosis from
the complete genome sequence. Nature 1998a; 393: 537-44.
31. Cole ST. Comparative mycobacterial genomics. Curr Opin Microbiol 1998b; 1: 567-71.
32. Cole ST. Learning from the genome sequence of Mycobacterium tuberculosis H37Rv. FEBS Lett 1999;
452: 7-10.
33. Cole ST, Eiglmeier K, Parkhill J, et al. Massive gene decay in the leprosy bacillus. Nature
2001; 409: 1007-11.
34. Cole ST. Comparative and functional genomics of the Mycobacterium tuberculosis complex.
Microbiology 2002a; 148: 2919-28.
35. Cole ST. Comparative mycobacterial genomics as a tool for drug target and antigen discovery. Eur
Respir J Suppl 2002b; 36: 78s-86s.
36. Cole ST, Eisenack KD, McMurray DN, Jacobs WR. (Eds.) Tuberculosis and the tubercle bacillus. ASM
Press. Washington DC. 2005.
37. Dainese E, Rodrigue S, Delogu G, et al. Posttranslational regulation of Mycobacterium
tuberculosis extracytoplasmic-function sigma factor sL and roles in virulence and in global
regulation of gene expression. Infect Immun 2006; 74: 2457-60.
38. Delogu G, Sanguinetti M, Posteraro B, Rocca S, Zanetti S, Fadda G. The hbhA gene of
Mycobacterium tuberculosis is specifically upregulated in the lungs but not in the spleens of
aerogenically infected mice. Infect Immun 2006; 74: 3006-11.
39. Dheenadhayalan V, Delogu G, Sanguinetti M, Fadda G, Brennan MJ. Variable expression patterns of
Mycobacterium tuberculosis PE_PGRS genes: evidence that PE_PGRS16 and PE_PGRS26 are inversely
regulated in vivo. J Bacteriol 2006; 188: 3721-5.
40. Domenech P, Barry CE 3rd, Cole ST. Mycobacterium tuberculosis in the post-genomic age. Curr Opin
Microbiol 2001; 4: 28-34.
41. Domenech P, Reed MB, Dowd CS, Manca C, Kaplan G, Barry CE 3rd. The role of MmpL8 in sulfatide
biogenesis and virulence of Mycobacterium tuberculosis. J Biol Chem 2004; 279: 21257-65.
42. Draper P. Lipid biochemistry takes a stand against tuberculosis. Nat Med 2000; 6: 977-8.
43. Ewann F, Jackson M, Pethe K, et al. Transient requirement of the PrrA-PrrB two-component system
for early intracellular multiplication of Mycobacterium tuberculosis. Infect Immun 2002; 70:
2256-63.
44. Ewann F, Loch C, Supply P. Intracellular autoregulation of the Mycobacterium tubreculosis PrrA
response regulator. Microbiology 2004; 150: 241-6.
45. Fiehn O, Weckwerth W. Deciphering metabolic networks. Eur J Biochem 2003; 270: 579-88.
46. Filliol I, Motiwala AS, Cavatore M, et al. Global phylogeny of Mycobacterium tuberculosis based
on single nucleotide polymorphism (SNP) analysis: insights into tuberculosis evolution, phylogenetic
accuracy of other DNA fingerprinting systems, and recommendations for a minimal standard SNP set. J
Bacteriol 2006; 188: 759-72.
47. Finlay BB, Falkow S. Common themes in microbial pathogenicity revisited. Microbiol Mol Biol Rev
1997; 61: 136-69.
48. Fleischmann RD, Adams MD, White O, et al. Whole-genome random sequencing and assembly of
Haemophilus influenzae Rd. Science 1995; 269: 496-512.
49. Fleischmann RD, Alland D, Eisen JA, et al. Whole-genome comparison of Mycobacterium tuberculosis
clinical and laboratory strains. J Bacteriol 2002; 184: 5479-90.
50. Forst CV. Host-pathogen systems biology. Drug Discov Today 2006; 11: 220-7.
51. Gagneux S, DeRiemer K, Van T, et al. Variable host-pathogen compatibility in Mycobacterium
tuberculosis. Proc Natl Acad Sci U S A 2006; 103: 2869-73.
52. Garbe TR, Hibler NS, Deretic V. Isoniazid induces expression of the antigen 85 complex in
Mycobacterium tuberculosis. Antimicrob Agents Chemother 1996; 40: 1754-6.
53. Garnier T, Eiglmeier K, Camus JC, et al. The complete genome sequence of Mycobacterium bovis.
Proc Natl Acad Sci U S A 2003; 100: 7877-82.
54. Garton NJ, Christensen H, Minnikin DE, Adegbola RA, Barer MR. Intracellular lipophilic
inclusions of mycobacteria in vitro and in sputum. Microbiology 2002; 148: 2951-8.
55. Geiman DE, Kaushal D, Ko C, et al. Attenuation of late-stage disease in mice infected by the
Mycobacterium tuberculosis mutant lacking the sigF alternate sigma factor and identification of
sigF-dependent genes by microarrays analysis. Infect Immun 2004; 72: 1733-45.
56. Geiman DE Raghunan TR, Agarwal N, Bishai WR. Differential gene expression in response to
exposure to antimycobacterial agents and other stress conditions among seven Mycobacterium
tuberculosis whiB-like genes. Antimicrob Agents Chemother 2006; 50: 2836-41.
57. Gey Van Pittius NC, Gamieldien J, Hide W, Brown GD, Siezen RJ, et al. The ESAT-6 gene cluster of
Mycobacterium tuberculosis and other high G+C Gram-positive bacteria. Genome Biol 2001; 2:
research0044.1-0044.18.
58. Gonzalo Asensio J, Maia C, Ferrer NL, et al. The virulence-associated two-component PhoP-PhoR
system controls the biosynthesis of polyketide-derived lipids in Mycobacterium tuberculosis. J Biol
Chem 2006; 281: 1313-6.
59. Gordon SV, Brosch R, Billault A, Garnier T, Eiglmeier K, Cole ST. Identification of variable
regions in the genomes of tubercle bacilli using bacterial artificial chromosome arrays. Mol
Microbiol 1999; 32: 643-55.
60. Gupta S, Sinha A, Sarkar D. Transcriptional autoregulation by Mycobacterium tuberculosis PhoP
involves recognition of novel direct repeat sequences in the regulatory region of the promoter. FEBS
Lett 2006; 580: 5328-38.
61. Gutacker MM, Smoot JC, Migliaccio CA, et al. Genome-wide analysis of synonymous single
nucleotide polymorphisms in Mycobacterium tuberculosis complex organisms: resolution of genetic
relationships among closely related microbial strains. Genetics 2002; 162: 1533-43.
62. Gygi SP, Rist B, Gerber SA, Turecek F, Gelb MH, Aebersold R. Quantitative analysis of complex
protein mixtures using isotope-coded affinity tags. Nat Biotechnol 1999; 17: 994-9.
63. Gu S, Chen J, Dobos KM, Bradbury EM, Belisle JT, Chen X. Comprehensive proteomic profiling of
the membrane constituents of a Mycobacterium tuberculosis strain. Mol Cell Proteomics 2003; 2:
1284-96.
64. Hahn MY, Raman S, Anaya M, Husson RN. The Mycobacterium tuberculosis extracytoplasmic-function
sigma factor SigL regulates polyketide synthases and secreted or membrane proteins and is requiered
for virulence. J Bacteriol 2005; 187: 7062-71.
65. Haydel SE, Benjamin Jr WH, Dunlap NE, Clark-Curtiss JE. Expression, autoregulation and DNA
binding properties of the Mycobacterium tuberculosis TrcR response regulator. J Bacteriol 2002; 184:
2192-203.
66. Haydel SE, Clark-Curtiss. Global expression analysis of two-component system regulator genes
during Mycobacterium tuberculosis growth in human macrophages. FEMS Microbiol Lett 2004; 236: 341-7.
67. He H, Zahrt TC. Identification and characterization of a regulatory sequence recognized by
Mycobacterium tuberculosis persistence regulator MprA. J Bacteriol 2005; 187: 202-12.
68. He H, Hovey R, Kane J, Singh V, Zahrt TC. MprA is a stress-responsive two-component system that
directly regulates expression of sigma factors SigB and SigE in Mycobacterium tuberculosis. J
Bacteriol 2006; 188: 2134-43.
69. Himpens S, Loch C, Supply P. Molecular characterization of the mycobacterial SenX3-RegX3
two-component system: evidence for autoregulation. Microbiology 2000; 146: 3091-8.
70. Hirschfield GR, McNeil M, Brennan PJ. Peptidoglycan-associated polypeptides of Mycobacterium
tuberculosis. J Bacteriol 1990; 172: 1005-13.
71. Hirsh AE, Tsolaki AG, DeRiemer K, Feldman MW, Small PM. Stable association between strains of
Mycobacterium tuberculosis and their human host populations. Proc Natl Acad Sci U S A 2004; 101:
4871-6.
72. Isobe T, Uchida K, Taoka M, Shinkai F, Manabe T, Okuyama T. Automated two-dimensional liquid
chromatographic system for mapping proteins in highly complex mixtures. J Chromatogr 1991; 588:
115-23.
73. Hughes AL, Friedman R, Murray M. Genomewide pattern of synonymous nucleotide substitution in two
complete genomes of Mycobacterium tuberculosis. Emerg Infect Dis 2002; 8: 1342-6.
74. Jackson M, Stadthagen G, Gicquel B. Long-chain multiple methyl-branched fatty acid-containing
lipids of Mycobacterium tuberculosis: Biosynthesis, transport, regulation and biological activities.
Tuberculosis (Edinb) 2007; 87: 78-86.
75. Jain M, Cox JS. Interaction between polyketide synthase and transporter suggests coupled
synthesis and export of virulence lipid in M. tuberculosis. PLoS Pathog 2005; 1: 12-9.
76. Johansen KA, Gill RE, Vasil ML. Biochemical and molecular analysis of phospholipase C and
phospholipase D activity in mycobacteria. Infect Immun 1996; 64: 3259-66.
77. Jungblut PR, Schaible UE, Mollenkopf HJ, et al. Comparative proteome analysis of Mycobacterium
tuberculosis and Mycobacterium bovis BCG strains: towards functional genomics of microbial
pathogens. Mol Microbiol 1999; 33: 1103-17.
78. Jungblut PR, Muller EC, Mattow J, Kaufmann SH. Proteomics reveals open reading frames in
Mycobacterium tuberculosis H37Rv not predicted by genomics. Infect Immun 2001; 69: 5905-7.
79. Kanduma E, McHugh TD, Gillespie SH. Molecular methods for Mycobacterium tuberculosis strain
typing: a users guide. J Appl Microbiol 2003; 94: 781-91.
80. Kaur D, Berg S, Dinadayala P, et al. Biosynthesis of mycobacterial lipoarabinomannan: role of a
branching mannosyltransferase. Proc Natl Acad Sci U S A 2006; 103: 13664-9.
81. Kazmierczak MJ, Wiedmann M, Boor KJ. Alternative sigma factors and their roles in bacterial
virulence. Microbiol Mol Biol Rev 2005; 69: 527-43.
82. Kempsell KE, Ji YE, Estrada IC, Colston MJ, Cox RA. The nucleotide sequence of the promoter, 16S
rRNA and spacer region of the ribosomal RNA operon of Mycobacterium tuberculosis and comparison with
Mycobacterium leprae precursor rRNA. J Gen Microbiol 1992; 138: 1717-27.
83. Kendall SL, Movahedzadeh F, Rison SCG, Wernisch L, Paris T, Duncan K, Betts JC, Stoker NG. The
Mycobacterium tuberculosis dosRS two-component system is induced by multiple stresses. Tuberculosis
2004; 84: 247-55.
84. Kinhikar AG, Vargas D, Li H, et al. Mycobacterium tuberculosis malate synthase is a
laminin-binding adhesin. Mol Microbiol 2006; 60: 999-1013.
85. Kremer L, Besra GS. A waxy tale, by Mycobacterium tuberculosis. Chapter 19, pgs:287-305. In
Tuberculosis and the tubercle bacillus. Cole ST (Ed.) ASM Press. 2005.
86. Kubica GP, Wayne LG (Eds.) The Mycobacteria: A sourcebook. Part A. Dekker mdi. Microbiology
series Vol 15. 1984.
87. Lee BY, Hefta SA, Brennan PJ. Characterization of the major membrane protein of virulent
Mycobacterium tuberculosis. Infect Immun 1992; 60: 2066-74.
88. Li L, Bannantine JP, Zhang Q, et al. The complete genome sequence of Mycobacterium avium
subspecies paratuberculosis. Proc Natl Acad Sci U S A 2005; 102: 12344-9.
89. Mahairas GG, Sabo PJ, Hickey MJ, Singh DC, Stover CK. Molecular analysis of genetic differences
between Mycobacterium bovis BCG and virulent M. bovis. J Bacteriol 1996; 178: 1274-82.
90. Manganelli R, Dubnau E, Tyagi S, Kramer FR, Smith I. Differential expression of 10 sigma factor
genes in Mycobacterium tuberculosis. Mol Microbiol 1999; 31: 715-24.
91. Manganelli R, Voskuil MI, Schoolnik GK, Smith I. The Mycobacterium tuberculosis ECF sigma factor
sE: role in global gene expression and survival in macrophages, Mol Microbiol 2001; 41: 423-37.
92. Manganelli R, Voskuil MI, Schoolnik GK, Dubnau E, Gomez M, Smith I. Role of the
extracytoplasmic-function sH in Mycobacterium tuberculosis global gene expression. Mol Microbiol
2002; 45: 365-74.
93. Manganelli R, Fattarini L, Tan D, et al. The extra cytoplasmic function sigma factor sE is
essential for Mycobacterium tuberculosis virulence in mice. Infect Immun 2004; 72: 3038-41.
94. Marmiesse M, Brodin P, Buchrieser C, et al. Macro-array and bioinformatic analyses reveal
mycobacterial ´core´ genes, variation in the ESAT-6 gene family and new phylogenetic markers for the
Mycobacterium tuberculosis complex. Microbiology 2004; 150: 483-96.
95. Marri PR, Bannantine JP, Golding GB. Comparative genomics of metabolic pathways in Mycobacterium
species: gene duplication, gene decay and lateral gene transfer. FEMS Microbiol Rev 2006; 30:
906-25.
96. Marsh IB, Whittington RJ. Deletion of an mmpL gene and multiple associated genes from the genome
of the S strain of Mycobacterium avium subsp. paratuberculosis identified by representational
difference analysis and in silico analysis. Mol Cell Probes 2005; 19: 371-84.
97. Martin-Orozco N, Touret N, Zaharik ML, et al. Visualization of vacuolar acidification-induced
transcription of genes of pathogens inside macrophages. Mol Biol Cell 2006; 17: 498-510.
98. Mattow J, Jungblut PR, Schaible UE, et al. Identification of proteins from Mycobacterium
tuberculosis missing in attenuated Mycobacterium bovis BCG strains. Electrophoresis 2001; 22:
2936-46.
99. Mattow J, Siejak F, Hagens K, et al. Proteins unique to intraphagosomally grown Mycobacterium
tuberculosis. Proteomics 2006; 6: 2485-94.
100. Mawuenyega KG, Forst CV, Dobos KM, et al. Mycobacterium tuberculosis functional network
analysis by global subcellular protein profiling. Mol Biol Cell 2005; 16: 396-404.
101. Mizrahi V, Andersen SJ. DNA repair in Mycobacterium tuberculosis. What have we learnt from the
genome sequence? Mol Microbiol 1998; 29: 1331-9.
102. Molle V, Palframan WJ, Findlay KC, Buttner MJ. WhiD and WhiB, homologous proteins required for
different stages of sporulation in Streptomyces coelicolor A3 (2). J Bacteriol 2000; 182: 1286-95.
103. Mollenkopf HJ, Jungblut PR, Raupach B, et al. A dynamic two-dimensional polyacrylamide gel
electrophoresis database: the mycobacterial proteome via Internet. Electrophoresis 1999; 20:
2172-80.
104. Mooney RA, Darst SA, Landick R. Sigma and RNA polymerase: an on-again, off-again relationship?
Molecular Cell 2005; 20: 335-45.
105. Nagai S, Wiker HG, Harboe M, Kinomoto M. Isolation and partial characterization of major
protein antigens in the culture fluid of Mycobacterium tuberculosis. Infect Immun 1991; 59: 372-82.
106. Newton SM, Smith RJ, Wilkinson KA, et al. A deletion defining a common Asian lineage of
Mycobacterium tuberculosis associates with immune subversion. Proc Natl Acad Sci U S A 2006; 103:
15594-8.
107. Nielsen J, Oliver S. The next wave in metabolome analysis. Trends Biotechnol 2005; 23: 544-6.
108. Ochman H, Moran NA. Genes lost and genes found: evolution of bacterial pathogenesis and
symbiosis. Science 2001; 292: 1096-9.
109. O´Farrell PH. High resolution two-dimensional electrophoresis of proteins. J Biol Chem 1975;
250: 4007-21.
110. Oliver SG, Winson MK, Kell DB, Baganz F. Systematic functional analysis of the yeast genome.
Trends Biotechnol 1998; 16: 373-8.
111. Ortalo-Magne A, Lemassu A, Laneelle MA, et al. Identification of the surface-exposed lipids on
the cell envelopes of Mycobacterium tuberculosis and other mycobacterial species. J Bacteriol 1996;
178: 456-61.
112. Parida BK, Douglas T, Nino C, Dhandayuthapani. Interactions of anti-sigma factor antagonists of
Mycobacyerium tuberculosis in the yeast two-hybrid system. Tuberculosis (Edinb) 2005; 85: 347-55.
113. Parish T, Smith DA, Roberts G, Betts J, Stoker NG. The senX3-regX3 two-component regulatory
system of Mycobacterium tuberculosis is required for virulence. Microbiology 2003a; 149: 1423-35.
114. Parish T. Smith DA, Kendall S, Casali N, Bancroft GJ, Stoker NG. Deletion of two-coponent
regulatory systems increases the virulence of Mycobacterium tuberculosis. Infect Immun 2003b; 71:
1134-40.
115. Park H-D, Guinn KM, Harrell MI, Liao R, Voskuil MI, Tompa M, Schoolnik GK, Sherman DR.
Rv3133c/dosR is a transcription factor that mediates the hypoxic response of Mycobacterium
tuberculosis Mol Microbiol 2003; 48: 833-43.
116. Park SJ, Lee SY, Cho J, et al. Global physiological understanding and metabolic engineering of
microorganisms based on omics studies. Appl Microbiol Biotechnol 2005; 68: 567-79.
117. Patterson SD. Proteomics: the industrialization of protein chemistry. Curr Opin Biotechnol
2000; 11: 413-8.
118. Philipp WJ, Poulet S, Eiglmeier K, et al. An integrated map of the genome of the tubercle
bacillus, Mycobacterium tuberculosis H37Rv, and comparison with Mycobacterium leprae. Proc Natl Acad
Sci U S A 1996; 93: 3132-7.
119. Portevin D, De Sousa-D´Auria C, Houssin C, et al. A polyketide synthase catalyzes the last
condensation step of mycolic acid biosynthesis in mycobacteria and related organisms. Proc Natl Acad
Sci U S A 2004; 101: 314-9.
120. Rajakumar K, Shafi J, Smith RJ, et al. Use of genome level-informed PCR as a new
investigational approach for analysis of outbreak-associated Mycobacterium tuberculosis isolates. J
Clin Microbiol 2004; 42: 1890-6.
121. Raman S, Song T, Puyang X, Bardarov S, Jacons Jr. WR, Husson RN. The alternative sigma factor
SigH regulates major components of oxidative and heat stress response in Mycobacterium tuberculosis.
J Bacteriol 2001; 183: 6119-25.
122. Raman S, Hazra R, Dascher CC, Hussin RN. Transcription regulation by the Mycobacterium
tuberculosis alternative sigma factor SigD and its role in virulence. J Bacteriol 2004; 186:
6605-16.
123. Raman S, Puyang X, Cheng T-Y, Young DC, Moody DB, Husson RN. Mycobacterium tuberculosis SigM
positively regulates Esx secreted proteins and nonribosomal peptide synthetase genes and down
regulates virulence-associated surface lipid synthesis. J Bacteriol 2006; 188: 8460-8.
124. Ratledge C, Stanford J. (Eds.) The Biology of the Mycobacteria. Vol I. Academic Press.
London/NewYork. 1982.
125. Reed MB, Domenech P, Manca C, et al. A glycolipid of hypervirulent tuberculosis strains that
inhibits the innate immune response. Nature 2004; 431: 84-7.
126. Rickman L, Scott C, Hunt DM, et al. A member of the cAMP receptor protein family of
transcription regulators in Mycobacterium tuberculosis is required for virulence in mice and
controls transcription of the rpfA gene coding for a resuscitation promoting factor. Mol Microbiol
2005; 56: 1274-86.
127. Riley LW. Of mice, men, and elephants: Mycobacterium tuberculosis cell envelope lipids and
pathogenesis. J Clin Invest 2006; 116: 1475-8.
128. Rodrigue S, Provvedi R, Jacques P-E, Gaudreau L, Manganelli R. The s factors of Mycobacterium
tuberculosis. FEMS Microbiol Rev 2006; 30: 926-41.
129. Rodriguez GM, Smith I. Mechanisms of iron regulation in mycobacteria: role in physiology and
virulence. Mol Microbiol 2003; 1485-94.
130. Rosenkrands I, Weldingh K, Jacobsen S, et al. Mapping and identification of Mycobacterium
tuberculosis proteins by two-dimensional gel electrophoresis, microsequencing and immunodetection.
Electrophoresis 2000a; 21: 935-48.
131. Rosenkrands I, King A, Weldingh K, Moniatte M, Moertz E, Andersen P. Towards the proteome of
Mycobacterium tuberculosis. Electrophoresis 2000b; 21: 3740-56.
132. Rosenkrands I, Slayden RA, Crawford J, Aagaard C, Barry CE 3rd, Andersen P. Hypoxic response of
Mycobacterium tuberculosis studied by metabolic labeling and proteome analysis of cellular and
extracellular proteins. J Bacteriol 2002; 184: 3485-91.
133. Saïd-Salim B, Mostowy S, Kristof AS, Behr MA. Mutations in Mycobacterium tuberculosis Rv0444c,
the gene encoding anti-SigK, explain high level expression of MPB70 and MPB83 in Mycobacterium
bovis. Mol Microbiol 2006; 62: 1251-63.
134. Sassetti CM, Boyd DH, Rubin EJ. Comprehensive identification of conditionally essential genes
in mycobacteria. Proc Natl Acad Sci U S A 2001; 98: 12712-7.
135. Schnappinger D, Ehrt S, Voskuil MI, et al. Transcriptional adaptation of Mycobacterium
tuberculosis within macrophages: insights into the phagosomal environment. J Exp Med 2003; 198:
693-704.
136. Schnappinger D, Schoolnik GK, Ehrt S. Expression profiling of host pathogen interactions: how
Mycobacterium tuberculosis and the macrophage adapt to one another. Microbes Infect 2006; 8:
1132-40.
137. Schmidt F, Donahoe S, Hagens K, et al. Complementary analysis of the Mycobacterium tuberculosis
proteome by two-dimensional electrophoresis and isotope-coded affinity tag technology. Mol Cell
Proteomics 2004; 3: 24-42.
138. Schoolnik GK. Functional and comparative genomics of pathogenic bacteria. Curr Opin Microbiol
2002; 5: 20-6.
139. Sherman DR, Voskuil M, Schnappinger D, Liao R, Harrel MI, Schoolnik GK. Regulation of the
Mycobacterium tuberculosis hypoxic response gene encoding alpha-crystallin Proc Natl Acad Sci USA
2001; 98: 7534-9.
140. Shires K, Steyn L. The cold-shock stress response in Mycobacterium smegmatis induces the
expression of a histone-like protein. Mol Microbiol 2001; 39: 994-1009.
141. Sinha S, Kosalai K, Arora S, et al. Immunogenic membrane-associated proteins of Mycobacterium
tuberculosis revealed by proteomics. Microbiology 2005; 151: 2411-9.
142. Simpson RJ, Connolly LM, Eddes JS, Pereira JJ, Moritz RL, Reid GE. Proteomic analysis of the
human colon carcinoma cell line (LIM 1215): development of a membrane protein database.
Electrophoresis 2000; 21: 1707-32.
143. Sirakova TD, Dubey VS, Kim HJ, Cynamon MH, Kolattukudy PE. The largest open reading frame
(pks12) in the Mycobacterium tuberculosis genome is involved in pathogenesis and dimycocerosyl
phthiocerol synthesis. Infect Immun 2003; 71: 3794-801.
144. Sola C, Filliol I, Legrand E, Mokrousov I, Rastogi N. Mycobacterium tuberculosis phylogeny
reconstruction based on combined numerical analysis with IS1081, IS6110, VNTR, and DR-based
spoligotyping suggests the existence of two new phylogeographical clades. J Mol Evol 2001; 53:
680-9.
145. Soliveri JA, Gomez J, Bishai WR, Chater KF. Multiple paralogous genes related to the
Streptomyces coelicolor developmental regulatory gene whiB are present in Streptomyces and other
actinomyetes. Microbiology 2000; 146: 333-43.
146. Sonden A, Kocincova D, Deshayes C, et al. Gap, a mycobacterial specific integral membrane
protein, is requiered for glycolipid transport to the cell surface. Mol Microbiol 2005; 58: 426-40.
147. Song T, Dove SL, Lee KH, Husson RN. Rsh, an anti-sigma factor that regulates the activity of
the mycobacterial stress response sigma factor SigH. Mol Microbiol 2003; 50: 949-59.
148. Sonnenberg MG, Belisle JT. Definition of Mycobacterium tuberculosis culture filtrate proteins
by two-dimensional polyacrylamide gel electrophoresis, N-terminal amino acid sequencing, and
electrospray mass spectrometry. Infect Immun 1997; 65: 4515-24.
149. Sreevatsan S, Pan X, Stockbauer KE, et al. Restricted structural gene polymorphism in the
Mycobacterium tuberculosis complex indicates evolutionarily recent global dissemination. Proc Natl
Acad Sci U S A 1997; 94: 9869-74.
150. Stanley SA, Raghavan S, Hwang WW, Cox JS. Acute infection and macrophage subversion by
Mycobacterium tuberculosis require a specialized secretion system. Proc Natl Acad Sci U S A 2003;
100: 13001-6.
151. Starck J, Kallenius G, Marklund BI, Andersson DI, Akerlund T. Comparative proteome analysis of
Mycobacterium tuberculosis grown under aerobic and anaerobic conditions. Microbiology 2004; 150:
3821-9.
152. Stewart GR, Wernisch L, Stabler R, et al. Disecction of the heat-shock response in
Mycobacterium tuberculosis using mutants and microarrays. Microbiology 2002; 148: 3129-38.
153. Steyn AJ, Collins DM, Hondalus MK, Jacobs WR Jr, Kawakami RP, Bloom BR. Mycobacterium
tuberculosis WhiB3 interacts with ProV to affect host survival but is dispensable for in vivo
growth. Proc Natl Acad Sci USA 2002; 5: 3147-52.
154. Sun R, Converse PJ, Ko Ch, Tyagi S, Morrison NE, Bishai WR. Mycobacterium tuberculosis ECF
sigma factor sigC is required for lethality in mice and for the conditional expression of a defined
gene set. Mol Microbiol 2004; 52: 25-38.
155. Sutcliffe IC, Harrington DJ. Lipoproteins of Mycobacterium tuberculosis: an abundant and
functionally diverse class of cell envelope components. FEMS Microbiol Rev 2004; 28: 645-59.
156. Taboada EN, Acedillo RR, Luebbert CC, Findlay WA, Nash JH. A new approach for the analysis of
bacterial microarray-based Comparative Genomic Hybridization: insights from an empirical study. BMC
Genomics 2005; 6: 78.
157. Talaat AM, Lyons R, Howard ST, Johnston SA. The temporal expression profile of Mycobacterium
tuberculosis infection in mice. Proc Natl Acad Sci USA 2004; 101: 4602-7.
158. Tekaia F, Gordon SV, Garnier T, Brosch R, Barrell BG, Cole ST. Analysis of the proteome of
Mycobacterium tuberculosis in silico. Tuber Lung Dis 1999; 79: 329-42.
159. Timm J, Post FA, Bekker L-G, et al. Differential expression of iron-, carbon-, and
oxygen-responsive mycobacterial genes in the lungs of chronically infected mice and tuberculosis
patients. Proc Natl Acad Sci USA 2003: 100: 14321-6.
160. Trivedi OA, Arora P, Vats A, et al. Dissecting the mechanism and assembly of a complex
virulence mycobacterial lipid. Mol Cell 2005; 17: 631-43.
161. Tsolaki AG, Hirsh AE, DeRiemer K, et al. Functional and evolutionary genomics of Mycobacterium
tuberculosis: insights from genomic deletions in 100 strains. Proc Natl Acad Sci U S A 2004; 101:
4865-70.
162. Tsolaki AG, Gagneux S, Pym AS, et al. Genomic deletions classify the Beijing/W strains as a
distinct genetic lineage of Mycobacterium tuberculosis. J Clin Microbiol 2005; 43: 3185-91.
163. Tweeddale H, Notley-McRobb L, Ferenci T. Effect of slow growth on metabolism of Escherichia
coli, as revealed by global metabolite pool ("metabolome") analysis. J Bacteriol 1998; 180: 5109-16.
164. Urquhart BL, Cordwell SJ, Humphery-Smith I. Comparison of predicted and observed properties of
proteins encoded in the genome of Mycobacterium tuberculosis H37Rv. Biochem Biophys Res Commun 1998;
253: 70-9.
165. van der Werf MJ, Jellema RH, Hankemeier T. Microbial metabolomics: replacing trial-and-error by
the unbiased selection and ranking of targets. J Ind Microbiol Biotechnol 2005; 32: 234-52.
166. van Embden JD, Cave MD, Crawford JT, et al. Strain identification of Mycobacterium tuberculosis
by DNA fingerprinting: recommendations for a standardized methodology. J Clin Microbiol 1993; 31:
406-9.
167. Veyron-Churlet R, Guerrini O, Mourey L, Daffe M, Zerbib D. Protein-protein interactions within
the Fatty Acid Synthase-II system of Mycobacterium tuberculosis are essential for mycobacterial
viability. Mol Microbiol 2004; 54: 1161-72.
168. Volpe E, Cappelli G, Grassi M, et al. Gene expression profiling of human macrophages at late
time of infection with Mycobacterium tuberculosis. Immunology 2006; 118: 449-60.
169. Voskuil MI, Schnappinger D, Visconti KC, et al. Inhibition of respiration by nitric oxide
induces a Mycobacterium tuberculosis dormancy program. J Exp Med 2003; 198: 705-13.
170. Walters SB, Dubnau E, Kolesnikova I, Laval F, Daffe M, Smith I. The Mycobacterium tuberculosis
PhoPR two-component system regulates genes essential for virulence and complex lipid biosynthesis.
Mol Microbiol 2006; 60: 312-30.
171. Wang QZ, Wu CY, Chen T, Chen X, Zhao XM. Integrating metabolomics into a systems biology
framework to exploit metabolic complexity: strategies and applications in microorganisms. Appl
Microbiol Biotechnol 2006; 70: 151-61.
172. Wayne LG, Hayes LG. An in vitro model for sequential study of shiftdown of Mycobacterium
tuberculosis through two stages of nonreplicating persistence. Infect Immun 1996; 64: 2062-9.
173. Wayne LG, Sohaskey CD. Nonreplicating persistance of Mycobacterium tuberculosis. Annu Rev
Microbiol 2001; 55: 139-63.
174. Weckwerth W, Morgenthal K. Metabolomics: from pattern recognition to biological interpretation.
Drug Discov Today 2005; 10: 1551-8.
175. Wheeler PR, Ratledge C. Metabolism of Mycobacterium tuberculosis. Chapter 23, pgs:353-385. In
Tuberculosis and the tubercle bacillus Bloom BR (Ed.) ASM Press. Washington DC. 1994.
176. West AH, Stock AM. Histidine kinases and response regulator proteins in two-component signaling
systems. Trends Biochem Sci 2001; 26: 369-76.
177. Zhang Y, Wallace RJ Jr, Mazurek GH. Genetic differences between BCG substrains. Tuber Lung Dis
1995; 76: 43-50.
|
|
|
|
|||