Determination of the human genome. What is the human genome: decoding

Scientists working to decipher the sequence of the human genetic code said they completed their work two years ahead of schedule.

This announcement comes less than three years after the “draft” genome was published in the world press. In June 2000, British Prime Minister Tony Blair and then US President Bill Clinton announced that 97% of the “book of life” had been deciphered.

Now the human DNA sequence is almost 100% decoded. This leaves small gaps that are considered too costly to fill, but a system capable of drawing medical and scientific conclusions from genetic data is already well established.

The Sanger Institute, the only British institution involved in the large-scale international project, completed almost a third of the total work. No scientific institute in the world has made a greater contribution to deciphering the genome.

According to its director, Professor Alan Bradley, deciphering the human genome is a critical step in a long journey, and the benefits that medicine will eventually receive from this research are truly phenomenal.

“Only one part of our work - the sequence of chromosome 20 - has already allowed us to accelerate the search for genes responsible for the development of diabetes, leukemia and childhood eczema,” says the professor. “We should not expect an immediate breakthrough, but there is no doubt that we are completing one of the most amazing chapters from the book of life."

High standards

An equally significant share of the decoding work fell on the shoulders of American scientists.

Dr Francis Collins, director of the US National Human Genome Research Institute, also points to the long-term outlook. “One of our projects involved identifying susceptibility genes for type II diabetes,” he says. “One in 20 people over 45 years old suffers from this disease, and this proportion is only increasing over time. Using a publicly available map of genetic sequences, we were able to select one gene on chromosome 20, the presence of which in the genome appears to increase the likelihood of developing type II diabetes.”

When the human genome project was officially announced, some experts argued that it would take 20 years or even more to complete. But the progress of the work was incredibly accelerated by the emergence of robotic manipulators and supercomputers. The activity of scientists in this direction was also stimulated by the information that, in parallel, the human genome was being deciphered by the privately funded company Celera Genomics.

In the last three years, the main goal of biologists has been to fill the gaps that remained in the already decoded DNA sequences, and to refine in more detail all other data, on the basis of which a “gold standard” could be developed that would form the basis for further developments in this field.

“We were able to achieve the limits we set in our work much sooner than we had hoped,” said Dr. Jane Rogers, head of DNA sequencing at the Sanger Institute, “while maintaining incredibly high standards of quality. This work allows researchers to immediately begin a variety of biomedical projects. They now have a beautifully polished end product that will be of great help to them. It's like going from recording your first music demo tape to working on a full-fledged classical CD.”

Knowing almost the entire sequence of almost three billion letter-nucleotides of the genetic code of our DNA, scientists will be able to come to grips with those problems of human life that are caused by genetic reasons.

Back in early April, Sir John Sulston, who has been leading the British part of the project almost from its inception, said that these studies would “unearth human genetic data that can be used forever.”

The work of identifying genes can now take days rather than years as before. But the main task of practical medicine is now to transform knowledge about which genes work incorrectly or cause certain disorders into knowledge of what can be done about it.

And to do this, researchers will need to better understand how proteins - complex molecules built according to the genetic “templates” of DNA - interact with each other to build and maintain our body.

The science of genomics already exists and is actively developing, but the science of proteomics is still in its infancy. And here, as Professor Bradley said, there is still a “long way to go”.

It was seven years ago - June 26, 2000. At a joint press conference with the US President and the British Prime Minister, representatives of two research groups - the International Human Genome Sequencing Consortium (IHGSC) and Celera Genomics - announced that work on deciphering the human genome, which began in the 70s, had been successfully completed and that a draft version had been compiled. A new episode of human development had begun - the post-genomic era.

What can deciphering the genome give us, and are the funds and efforts spent worth the results achieved? In 2000 Francis Collins, the head of the American Human Genome Program, gave the following forecast for the development of medicine and biology in the post-genomic era:

  • 2010 - genetic testing, preventive measures that reduce the risk of diseases, and gene therapy for up to 25 hereditary diseases. Nurses are beginning to perform medical genetic procedures. Preimplantation diagnostics are widely available, and the limitations of this method are actively discussed. The United States has laws to prevent genetic discrimination and respect privacy. Practical applications of genomics are not accessible to everyone, especially in developing countries.
  • 2020 - drugs for diabetes, hypertension and other diseases, developed on the basis of genomic information, are appearing on the market. Cancer therapies are being developed that specifically target the properties of cancer cells in specific tumors. Pharmacogenomics is becoming a common approach for the development of many drugs. The way mental illnesses are diagnosed is changing, new methods of treating them are emerging, and society's attitude towards such diseases is changing. Practical applications of genomics are still not available everywhere.
  • 2030 - determination of the nucleotide sequence of the entire genome of an individual will become a routine procedure, costing less than $1000. Genes involved in the aging process have been cataloged. Clinical trials are being conducted to increase the maximum lifespan of humans. Laboratory experiments on human cells have been replaced by experiments on computer models. Mass movements of opponents of advanced technologies are intensifying in the United States and other countries.
  • 2040 - all generally accepted health measures are based on genomics. Predisposition to most diseases is determined (even before birth). Effective preventive medicine tailored to the individual is available. Diseases are detected at early stages through molecular monitoring. Gene therapy is available for many diseases. Drugs are being replaced by gene products that the body produces in response to therapy. Average life expectancy reaches 90 years due to improved socio-economic conditions. There is serious debate about man's ability to control his own evolution. Inequality in the world persists, creating tension at the international level.

As can be seen from the forecast, genomic information may in the near future become the basis for the treatment and prevention of many diseases. Without information about their genes (and it fits on a standard DVD), a person in the future will at best be able to get a runny nose cured by some healer in the jungle. Does this seem fantastic? But universal vaccination against smallpox once seemed just as fantastic, and so did the Internet (note that it did not exist in the 70s)! In the future, a child's genetic code will be handed to the parents in the maternity hospital. In theory, with such a disk the treatment and prevention of any ailment of an individual person will become a mere trifle. A professional doctor will be able to make a diagnosis in an extremely short time, prescribe effective treatment, and even estimate the likelihood of various diseases appearing in the future. For example, modern genetic tests can already accurately determine the degree of a woman's predisposition to breast cancer. Almost certainly, in 40–50 years no self-respecting doctor will want to “treat blindly” without the patient's genetic code - just as today surgery cannot do without an X-ray.
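The remark that a genome fits on a standard DVD can be checked with a little arithmetic. The sketch below is only an illustration: the genome length of three billion bases is the figure used in this article, while the DVD capacity of 4.7 GB (single-layer disc) and the two encodings compared are assumptions made for the example.

```python
# Back-of-the-envelope storage estimate for one copy of the human genome sequence.
GENOME_BASES = 3_000_000_000   # "almost three billion" nucleotides, per the article
DVD_BYTES = 4_700_000_000      # assumption: single-layer DVD, 4.7 GB (decimal)


def genome_size_bytes(bases: int, bits_per_base: int) -> int:
    """Bytes needed to store `bases` nucleotides at a fixed number of bits each."""
    total_bits = bases * bits_per_base
    return (total_bits + 7) // 8   # round up to whole bytes


# Four letters (A, C, G, T) need only 2 bits each when packed tightly.
packed = genome_size_bytes(GENOME_BASES, 2)
# A plain text file spends one 8-bit character per base.
plain_text = genome_size_bytes(GENOME_BASES, 8)

for label, size in [("2-bit packed", packed), ("plain text", plain_text)]:
    print(f"{label}: {size / 1e9:.2f} GB, fits on DVD: {size <= DVD_BYTES}")
```

Under these assumptions the sequence takes about 0.75 GB when packed and 3 GB as plain text, so either form fits on one disc.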

Let's ask ourselves a question: is what was said above reliable, or will everything in reality turn out the other way around? Will people finally be able to overcome all diseases and achieve universal happiness? Alas. Let's start with the fact that the Earth is small, and there is not enough happiness for everyone. In truth, there is not enough even for half the population of developing countries. “Happiness” is intended mainly for states that are advanced in science, in particular the biological sciences. For example, the technique with which you can “read” the genetic code of any person has long been patented. This is a well-developed automated technology, although it is expensive and very delicate. If you want, buy a license; if you want, come up with a new technique. But not all countries have enough money for such development! As a result, a number of states will have medicine that is significantly ahead of the rest of the world. Naturally, in underdeveloped countries the Red Cross will build charitable hospitals, clinics and genomic centers. And gradually this will lead to the genetic information of patients in developing countries (which are the majority) being concentrated in the two or three powers that finance this charity. It's hard to even imagine what can be done with such information. Maybe nothing bad will come of it. However, another outcome is also possible. The battle over priority that accompanied genome sequencing illustrates how much the availability of genetic information matters. Let's briefly recall some facts from the history of the Human Genome Program.

Opponents of genome decoding considered the task unrealistic, because human DNA is tens of thousands of times longer than the DNA molecules of viruses or plasmids. The main argument against was: “The project will require billions of dollars that other areas of science will miss, so the genomic project will slow down the development of science as a whole. But if money is found and the human genome is deciphered, then the resulting information will not justify the costs...” However, James Watson, one of the discoverers of the structure of DNA and the ideologist of the program for total reading of genetic information, wittily retorted: “It's better not to catch a big fish than not to catch a small one.” The scientist's argument was heard - the genome problem was brought up for discussion in the US Congress, and as a result, the national Human Genome program was adopted.

In the American city of Bethesda, not far from Washington, there is one of the coordination centers of HUGO (Human Genome Organization). The center coordinates scientific work on the topic “Human Genome” in six countries - Germany, England, France, Japan, China and the USA. Scientists from many countries of the world joined the work, united in three teams: two interstate ones - the American Human Genome Project and the British team from the Wellcome Trust Sanger Institute - and a private corporation from Maryland, which entered the game a little later - Celera Genomics. By the way, this is perhaps the first time in biology that a private company has competed with intergovernmental organizations at such a high level.

The struggle was waged with colossal means and capabilities. As Russian experts noted some time ago, Celera stood on the shoulders of the Human Genome program, that is, it used what had already been done as part of the global project. Indeed, Celera Genomics did not join the program at the start, but when the project was already in full swing. However, experts from Celera improved the sequencing algorithm. In addition, a supercomputer was built to their order, which made it possible to assemble the identified “building blocks” of DNA into the resulting sequence faster and more accurately. Of course, all this did not give Celera an unconditional advantage, but it forced the others to treat the company as a full participant in the race.

With the appearance of Celera Genomics, tensions sharply increased - those employed in government programs felt fierce competition. In addition, after the creation of the company, the question of how efficiently public investment was being used became acute. Celera was headed by Professor Craig Venter, who had extensive experience of scientific work under the state “Human Genome” program. It was he who said that all public programs are ineffective and that his company sequences the genome faster and cheaper. And then another factor appeared - the large pharmaceutical companies took notice. The fact is that if all the information about the genome is in the public domain, they will lose intellectual property, and there will be nothing to patent. Concerned about this, they invested billions of dollars in Celera Genomics (which was probably easier to negotiate with). This further strengthened its position. In response, the teams of the interstate consortium urgently had to increase the efficiency of their genome decoding work. At first the work was uncoordinated, but then certain forms of coexistence were achieved - and the race began to pick up pace.

The finale was beautiful - the competing organizations, by mutual agreement, simultaneously announced the completion of work on deciphering the human genome. This happened, as we already wrote, on June 26, 2000. But the time difference between America and England brought the United States to first place.

Figure 1. The “Race for the Genome,” which involved an intergovernmental consortium and a private company, formally ended in a “draw”: both groups of researchers published their achievements almost simultaneously. Craig Venter, head of the private company Celera Genomics, published his work in the journal Science, co-authored with ~270 scientists who worked under his supervision. The work carried out by the International Human Genome Sequencing Consortium (IHGSC) was published in the journal Nature, and the full list of authors numbers about 2,800 people working in nearly three dozen centers around the world.

The research lasted a total of 15 years. Creating the first “draft” version of the human genome cost $300 million. However, about three billion dollars were allocated for all research on this topic, including comparative analyses and the resolution of a number of ethical problems. Celera Genomics invested about the same amount, although it spent it in just six years. The price is colossal, but this amount is insignificant in comparison with the benefits that the country that developed the technology will receive from the final victory over dozens of serious diseases, which is expected soon. In an early October 2002 interview with The Associated Press, Celera Genomics president Craig Venter said one of his non-profit organizations plans to produce CDs containing as much information as possible about a client's DNA. The preliminary cost of such an order is more than 700 thousand dollars. And one of the discoverers of the structure of DNA - Dr. James Watson - was already given two DVDs with his genome, worth a total of $1 million, this year - as we see, prices are falling. Indeed, Michael Egholm, vice president of the company 454 Life Sciences, reported that the company will soon be able to bring the price of sequencing down to 100 thousand dollars.

Widespread fame and large-scale funding are a double-edged sword. On the one hand, thanks to unlimited funds, work progresses easily and quickly. But on the other hand, the result of the research is expected to turn out the way it was ordered. By the beginning of 2001, more than 20 thousand genes had been identified with 100% certainty in the human genome. This figure turned out to be three times smaller than predicted just two years earlier. A second team of researchers from the US National Human Genome Research Institute, led by Francis Collins, independently obtained the same result - between 20 and 25 thousand genes in the genome of each human cell. However, two other international collaborative research projects added uncertainty to the final estimates. Dr. William Haseltine (chief executive of Human Genome Sciences) insisted that their bank contained information about 140 thousand genes, and that he was not going to share this information with the world community for the time being. His company had invested money in patents and planned to make money from the information obtained, since it relates to genes for widespread human diseases. Another group claimed that it had identified 120,000 human genes and also insisted that this figure reflected the total number of human genes.

Here it is necessary to clarify that these researchers were deciphering not the DNA sequence of the genome itself, but DNA copies of messenger RNA (mRNA). In other words, not the entire genome was studied, but only that part of it which the cell transcribes into mRNA and which directs the synthesis of proteins. Since one gene can serve as a template for the production of several different types of mRNA (which is determined by many factors: cell type, stage of development of the organism, etc.), the total number of all different mRNA sequences (and this is exactly what Human Genome Sciences patented) will be significantly larger than the number of genes. Most likely, using this number to estimate the number of genes in the genome is simply incorrect.
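Why a count of distinct mRNAs overstates the number of genes can be shown with a toy example. The sketch below is purely illustrative: the gene and transcript names are invented, and the data structure (a mapping from each gene to the mRNA variants produced from it) is an assumption made for the example, not the format of any real database.

```python
# Toy illustration: one gene can yield several mRNAs, so counting mRNAs
# overestimates the number of genes. All names below are invented.
transcripts_by_gene = {
    "GENE_A": ["GENE_A.mRNA1", "GENE_A.mRNA2", "GENE_A.mRNA3"],
    "GENE_B": ["GENE_B.mRNA1"],
    "GENE_C": ["GENE_C.mRNA1", "GENE_C.mRNA2"],
}


def count_genes(catalog: dict[str, list[str]]) -> int:
    """Number of genes: one entry per gene, however many mRNAs it produces."""
    return len(catalog)


def count_transcripts(catalog: dict[str, list[str]]) -> int:
    """Number of distinct mRNA sequences across all genes."""
    return sum(len(set(mrnas)) for mrnas in catalog.values())


genes = count_genes(transcripts_by_gene)
mrnas = count_transcripts(transcripts_by_gene)
print(f"{genes} genes, {mrnas} distinct mRNAs, ratio {mrnas / genes:.1f}")
```

In this made-up catalog, three genes give six distinct mRNAs, so a gene estimate based on the mRNA count would be twice too high.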

Obviously, hastily “privatized” genetic information will be carefully checked in the coming years until the exact number of genes finally becomes generally accepted. But what is alarming is the fact that in the process of “cognition” everything that can be patented is patented. It is not just the skin of a bear that has yet to be killed that has been divided up, but everything that was in the den! By the way, today the debate has slowed down, and the human genome officially contains only 21,667 genes (NCBI version 35, dated October 2005). It should be noted that for now most of the information remains publicly available. There are now databases that accumulate information about the structure of the genomes not only of humans but also of many other organisms (for example, EnsEMBL). However, attempts to obtain exclusive rights to use particular genes or sequences for commercial purposes have always been made, are being made now and will continue to be made.

Today, the main goals of the structural part of the program have already been largely fulfilled - the human genome has been read almost completely. The first, “draft” version of the sequence, published in early 2001, was far from perfect. It was missing approximately 30% of the genome sequence as a whole, of which about 10% was the sequence of the so-called euchromatin - gene-rich and actively expressed regions of chromosomes. According to recent estimates, euchromatin makes up approximately 93.5% of the entire genome. The remaining 6.5% comes from heterochromatin - these regions of chromosomes are poor in genes and contain a large number of repeats, which pose serious difficulties for scientists trying to read their sequence. Moreover, DNA in heterochromatin is thought to be inactive and not expressed. (This may explain the “inattention” of scientists to the remaining “small” percentages of the human genome.) But even the “draft” versions of euchromatic sequences available in 2001 contained a large number of breaks, errors, and incorrectly connected and oriented fragments. Without in any way detracting from the significance of this draft for science and its applications, it is worth noting that the use of this preliminary information in large-scale experiments analyzing the genome as a whole (for example, when studying the evolution of genes or the general organization of the genome) revealed many inaccuracies and artifacts. Therefore, further and no less painstaking work, “the last steps”, was absolutely necessary.

Figure 2. Left: Automated line for preparing DNA samples for sequencing at the Whitehead Institute Genomic Research Center. Right: A laboratory filled with machines for high-throughput decoding of DNA sequences.

Completing the sequence took several more years and nearly doubled the cost of the entire project. However, already in 2004 it was announced that euchromatin was 99% read with an overall accuracy of one error per 100,000 base pairs. The number of breaks decreased 400-fold. The accuracy and completeness of reading have become sufficient for an effective search for the genes responsible for a particular hereditary disease (for example, diabetes or breast cancer). In practical terms, this means that researchers no longer have to go through the labor-intensive process of confirming the sequences of the genes they are working with, since they can rely entirely on a specific, publicly available sequence of the entire genome.
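To get a feel for what “one error per 100,000 base pairs” means on the scale of the whole genome, the figures quoted in this article can be combined in a short calculation. This is a rough sketch only: it assumes a genome of three billion base pairs, a euchromatin share of 93.5%, and errors spread evenly over the 99% of euchromatin that was read.

```python
# Rough estimate of residual errors in the 2004 "finished" sequence,
# using only figures quoted in the article.
GENOME_BP = 3_000_000_000       # approximate genome length in base pairs
EUCHROMATIN_FRACTION = 0.935    # share of the genome that is euchromatin
COVERAGE = 0.99                 # share of euchromatin read by 2004
ERRORS_PER_BP = 1 / 100_000     # stated accuracy: one error per 100,000 bp


def expected_errors(genome_bp: int, fraction: float,
                    coverage: float, error_rate: float) -> float:
    """Expected error count, assuming errors are spread evenly (a simplification)."""
    sequenced_bp = genome_bp * fraction * coverage
    return sequenced_bp * error_rate


errors = expected_errors(GENOME_BP, EUCHROMATIN_FRACTION, COVERAGE, ERRORS_PER_BP)
print(f"Expected residual errors: about {round(errors):,}")
```

Under these assumptions the finished sequence would still contain roughly 28,000 single-base errors, which is small next to the nearly 2.8 billion base pairs that were read.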

Thus, the original project plan was significantly exceeded. Has this helped us understand how our genome is structured and how it works? Undoubtedly. The authors of the article in Nature in which the “final” (as of 2004) version of the genome was published used it to carry out several analyses that would have been absolutely meaningless if they had had only a “draft” sequence on hand. It turned out that more than a thousand genes were “born” quite recently (by evolutionary standards, of course) - through the duplication of an original gene and the subsequent independent development of the daughter gene and the parent gene. And just under forty genes have recently “died”, having accumulated mutations that made them completely inactive. Another article, published in the same issue of Nature, directly points out the shortcomings of the method used by the scientists from Celera. The consequence of these shortcomings was the omission of numerous repeats in the DNA sequences that were read and, as a result, an underestimate of the length and complexity of the entire genome. To avoid repeating similar mistakes in the future, the authors of the article proposed a hybrid strategy - a combination of the highly efficient approach used by the scientists from Celera and the comparatively slow and labor-intensive, but also more reliable, method used by the IHGSC researchers.

Where will the unprecedented Human Genome study go next? Something can be said about this now. Founded in September 2003, the international consortium ENCODE (ENCyclopaedia Of DNA Elements) set as its goal the discovery and study of “control elements” (sequences) in the human genome. Indeed, 3 billion base pairs (this is the length of the human genome) contain only 22 thousand genes, scattered in this ocean of DNA in a manner incomprehensible to us. What controls their expression? Why do we need such an excess of DNA? Is it really ballast, or does it still manifest itself, possessing some unknown functions?

To begin with, as a pilot project, ENCODE scientists took a “closer look” at a sequence representing 1% of the human genome (30 million base pairs), using the latest equipment for molecular biology research. The results were published in April this year in Nature. It turned out that most of the human genome (including regions previously considered “silent”) serves as a template for the production of various RNAs, many of which are not messenger RNAs because they do not encode proteins. Many of these “non-coding” RNAs overlap with “classical” genes (sections of DNA that code for proteins). Another unexpected result was how the regulatory DNA regions were located relative to the genes whose expression they controlled. The sequences of many of these regions changed little during evolution, while other regions thought to be important for cell control mutated and changed at unexpectedly high rates during evolution. All these findings have raised a large number of new questions, the answers to which can only be obtained in further research.

Another task, the solution of which will be a matter of the near future, is to determine the sequence of the remaining “small” percentages of the genome that make up heterochromatin, i.e., gene-poor and repeat-rich DNA sections necessary for the doubling of chromosomes during cell division. The presence of repeats makes the task of deciphering these sequences intractable for existing approaches, and, therefore, requires the invention of new methods. Therefore, do not be surprised when another article is published in 2010, announcing the “finishing” of deciphering the human genome - it will talk about how heterochromatin was “hacked.”

Of course, for now we have at our disposal only a certain “average” version of the human genome. Figuratively speaking, today we have only the most general description of the design of a car: engine, chassis, wheels, steering wheel, seats, paint, upholstery, gasoline and oil, etc. A closer examination of the result shows that years of work lie ahead to refine our knowledge of each specific genome. The Human Genome Program has not ceased to exist; it is only changing its orientation: from structural genomics to functional genomics, which aims to determine how genes are controlled and how they work. Moreover, people differ at the gene level in the way that cars of the same model differ when fitted with different versions of the same components. Not only can individual bases in the gene sequences of two different people differ, but the number of copies of large DNA fragments, sometimes including several genes, can also vary greatly. This means that detailed comparison of the genomes of, say, representatives of different human populations, ethnic groups, and even healthy and sick people is coming to the fore. Modern technologies make it possible to carry out such comparative analyses quickly and accurately, something no one dreamed of ten years ago. Another international scientific association is studying structural variations in the human genome. In the USA and Europe, significant funds are allocated to bioinformatics - a young science that arose at the intersection of computer science, mathematics and biology, without which it is impossible to make sense of the boundless ocean of information accumulated in modern biology.
Bioinformatics methods will help us answer many interesting questions - “how did human evolution occur?”, “which genes determine certain characteristics of the human body?”, “which genes are responsible for susceptibility to diseases?” As the English say: “This is the end of the beginning.” This phrase accurately reflects the current situation. The most important and - I am absolutely sure - the most interesting part is just beginning: the accumulation of results, their comparison and further analysis.

“...Today we are releasing the first edition of the ‘Book of Life’ with our instructions,” Francis Collins said on the Rossiya TV channel. “We will turn to it for tens, hundreds of years. And soon people will wonder how they could manage without this information.”

Another point of view can be illustrated by quoting Academician V. A. Kordyum:

“...The hopes that new information about the functions of the genome will be completely open are purely symbolic. It can be predicted that gigantic centers will arise (on the basis of existing ones) that will be able to connect all the data into one coherent whole, a kind of electronic version of Man, and implement it practically - in genes, proteins, cells, tissues, organs and anything else. But what? Pleasing to whom? For what? In the course of work on the “human genome” program, methods and equipment for determining the primary DNA sequence were rapidly improved. In the largest centers this turned into a kind of factory activity. But even at the level of individual laboratory devices (or rather, their complexes), equipment has already been created that is so advanced it can determine in three months a DNA sequence equal in volume to the entire human genome. It is not surprising that the idea of determining the genomes of individual people arose (and immediately began to be rapidly implemented). Of course, it is very interesting to compare the differences between individuals at the level of their fundamental principles. The benefits of such a comparison are also undoubted. It will be possible to determine who has what abnormalities in the genome, predict their consequences and eliminate what can lead to disease. Health will be guaranteed, and life will be extended quite significantly. This is on the one hand. On the other hand, everything is not at all obvious. Obtaining and analyzing the entire heredity of an individual means obtaining a complete, comprehensive biological dossier on him. It will allow whoever possesses it, if they so wish, to do whatever they want with that person just as comprehensively.
According to the already known chain: a cell is a molecular machine; a person is made up of cells; the cell in all its manifestations and in the entire range of possible responses is recorded in the genome; the genome can already be manipulated to a limited extent today, and in the foreseeable future it can be manipulated in almost any way...”

However, it is probably too early to be afraid of such gloomy forecasts (although you certainly need to know about them). To implement them, many social and cultural traditions would have to be completely rebuilt. Mikhail Gelfand, Doctor of Biological Sciences and Acting Deputy Director of the Institute for Information Transmission Problems of the Russian Academy of Sciences, put this very well in an interview: “...if you have, say, one of the five genes that predetermine the development of schizophrenia, then what could happen if this information - your genome - fell into the hands of your potential employer, who does not understand anything about genomics! (And as a result, they may not hire you, considering it risky, even though you do not have and will not have schizophrenia - author's note.) Another aspect: with the advent of individualized medicine based on genomics, insurance medicine will completely change. After all, it is one thing to provide for unknown risks, and another thing to provide for completely certain ones. To be honest, Western society as a whole, not only Russian society, is not ready for the genomic revolution now...”

Indeed, in order to use new information wisely, you need to understand it. And to understand the genome, simply reading it is far from enough; that will take us decades. A very complex picture is emerging, and to make sense of it we will need to change many stereotypes. So in fact the deciphering of the genome is still ongoing and will continue. Whether we stand aside or finally become active participants in this race depends on us.

Literature

  1. Kiselev L. (2001). New Biology began in February 2001. "Science and life";
  2. Kiselev L. (2002). The second life of the genome: from structure to function. "Knowledge is power". 7;
  3. Ewan Birney, The ENCODE Project Consortium, John A. Stamatoyannopoulos, Anindya Dutta, Roderic Guigó, et al. (2007). Identification and analysis of functional elements in 1% of the human genome by the ENCODE pilot project. Nature. 447, 799-816;
  4. Lincoln D. Stein. (2004). Human genome: End of the beginning. Nature. 431, 915-916;
  5. Gelfand M. (2007). Postgenomic era. "Commercial Biotechnology".

The international Human Genome Project was launched in 1988. It is one of the most labor-intensive and expensive projects in the history of science. Whereas about 60 million dollars in total were spent on it in 1990, in 1998 the US government alone spent 253 million dollars, and private companies spent even more. The project involves several thousand scientists from more than 20 countries. Russia has participated since 1989, with about 100 groups working on the project. All human chromosomes are divided among the participating countries, and Russia received the 3rd, 13th and 19th chromosomes for research.

The main goal of the project is to find out the sequence of nucleotide bases in all human DNA molecules and establish localization, i.e. completely map all human genes. The project includes as subprojects the study of the genomes of dogs, cats, mice, butterflies, worms and microorganisms. The researchers are then expected to determine all of the genes' functions and develop ways to use the findings.

What is the main subject of the project – the human genome?

It is known that the nucleus of each human somatic cell contains 23 pairs of chromosomes (in addition to nuclear DNA there is also mitochondrial DNA), and each chromosome is represented by one DNA molecule. The total length of all 46 DNA molecules in one cell is approximately 2 m; a single (haploid) set of 23 chromosomes contains about 3.2 billion nucleotide pairs. The total length of DNA in all cells of the human body (there are approximately 5×10^13 of them) is 10^11 km, which is several hundred times greater than the distance from the Earth to the Sun.
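The figures in this paragraph can be checked with a few lines of arithmetic. The following is a minimal sketch that uses only the numbers quoted in the text; the mean Earth-Sun distance of about 1.5×10^8 km is an added assumption that the article itself does not state.

```python
# Figures quoted in the text.
DNA_PER_CELL_M = 2.0   # total DNA length in one cell, in metres
CELLS_IN_BODY = 5e13   # approximate number of cells in the human body

# Assumption (not from the text): mean Earth-Sun distance, about 1.5e8 km.
EARTH_SUN_KM = 1.5e8


def total_dna_length_km(dna_per_cell_m: float, cells: float) -> float:
    """Total length of DNA in the whole body, in kilometres."""
    return dna_per_cell_m * cells / 1000.0


total_km = total_dna_length_km(DNA_PER_CELL_M, CELLS_IN_BODY)
ratio = total_km / EARTH_SUN_KM

print(f"Total DNA length: {total_km:.1e} km")
print(f"Multiples of the Earth-Sun distance: {ratio:.0f}")
```

Under these assumptions the total comes to 10^11 km, or several hundred Earth-Sun distances.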

How do such long molecules fit into the nucleus? It turns out that the nucleus has a mechanism for “forced” packing of DNA into chromatin, which proceeds through several levels of compaction.

The first level involves the organization of DNA with histone proteins - the formation of nucleosomes. Two molecules each of four types of nucleosomal histone proteins form an octamer in the form of a coil on which a DNA strand is wound. One nucleosome contains about 200 base pairs. Between the nucleosomes there remains a DNA fragment of up to 60 base pairs in size, called a linker. This level of folding makes it possible to reduce the linear dimensions of DNA by 6–7 times.

At the next level, nucleosomes are arranged into a fibril (solenoid). Each turn consists of 6-7 nucleosomes, while the linear dimensions of DNA are reduced to 1 mm, i.e. 25-30 times.
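As a rough consistency check on the compaction figures, the sketch below applies the quoted fold ranges to an average DNA molecule. It rests on two assumptions that the text does not spell out: that the 6–7 and 25–30 fold reductions are both measured relative to fully extended DNA, and that the "1 mm" figure refers to a single average molecule (2 m divided among 46 molecules).

```python
TOTAL_DNA_M = 2.0   # total DNA length per cell, from the text
MOLECULES = 46      # DNA molecules per somatic cell, from the text


def compacted_length_mm(length_mm, fold_low, fold_high):
    """Return (shortest, longest) length after compaction by a fold range."""
    return length_mm / fold_high, length_mm / fold_low


# Assumption: all 46 molecules are treated as equal in length.
average_molecule_mm = TOTAL_DNA_M * 1000 / MOLECULES

nucleosome_level = compacted_length_mm(average_molecule_mm, 6, 7)
fibril_level = compacted_length_mm(average_molecule_mm, 25, 30)

print(f"Extended average molecule: {average_molecule_mm:.1f} mm")
print(f"After nucleosome packing: {nucleosome_level[0]:.1f}-{nucleosome_level[1]:.1f} mm")
print(f"After fibril packing: {fibril_level[0]:.1f}-{fibril_level[1]:.1f} mm")
```

With these assumptions the fibril-level length comes out at roughly 1.5 mm per molecule, the same order of magnitude as the 1 mm quoted above.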

The third level of compaction is the looping of fibrils: the formation of loop domains that extend at an angle from the main axis of the chromosome. They can be seen under a light microscope as interphase "lampbrush" chromosomes. The cross-striation characteristic of mitotic chromosomes reflects, to some extent, the order of genes in the DNA molecule.

In prokaryotes the length of a gene corresponds to the size of the protein it encodes, whereas in eukaryotes the amount of DNA is much greater than the total size of the coding genes. This is explained, firstly, by the mosaic, or exon-intron, structure of the gene: coding fragments (exons) are interspersed with non-coding regions (introns). The whole gene sequence is first transcribed into an RNA molecule, from which the introns are then cut out and the exons are stitched together, and in this form the information from the mRNA molecule is read on the ribosome. The second reason for the colossal size of DNA is the large number of repeated sequences. Some are repeated tens or hundreds of times, while others have up to 1 million repeats per genome. For example, the gene encoding rRNA is repeated about 2 thousand times.
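The cut-and-stitch step described above can be illustrated with a toy example. This is only a sketch: the sequence and the exon coordinates are made up for illustration, and real splicing is carried out by cellular machinery that recognizes signals in the RNA rather than a coordinate list.

```python
from typing import List, Tuple


def splice(pre_mrna: str, exons: List[Tuple[int, int]]) -> str:
    """Join exon segments (0-based, end-exclusive) and drop everything between them."""
    return "".join(pre_mrna[start:end] for start, end in sorted(exons))


# Hypothetical transcript: uppercase letters mark exons, lowercase mark introns.
pre_mrna = "AUGGCC" + "guaaguuuucag" + "UUUGGA" + "guacgcag" + "AAAUAA"

# Hypothetical exon coordinates matching the uppercase segments above.
exons = [(0, 6), (18, 24), (32, 38)]

mrna = splice(pre_mrna, exons)
print(mrna)
print(f"Transcript length: {len(pre_mrna)}, mature mRNA length: {len(mrna)}")
```

In this toy case more than half of the transcript is intron and is discarded before the message reaches the ribosome.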

Back in 1996, it was believed that a person has about 100 thousand genes; now bioinformatics experts suggest that there are no more than 60 thousand genes in the human genome, that they account for only 3% of the total length of cellular DNA, and that the functional role of the remaining 97% has not yet been established.

What are the achievements of scientists in just over ten years of work on the project?

The first major success was the complete mapping of the genome of the bacterium Haemophilus influenzae in 1995. Later, the genomes of more than 20 bacteria were fully described, including the causative agents of tuberculosis, typhus, syphilis, etc. In 1996, the DNA of the first eukaryotic cell, yeast, was mapped, and in 1998 the genome of a multicellular organism, the roundworm Caenorhabditis elegans, was mapped for the first time. By 1998, nucleotide sequences had been established for 30,261 human genes, i.e. about half of human genetic information had been deciphered.

Below is the known data on the number of genes involved in the development and functioning of certain human organs and tissues.

Organ, tissue or cell type and number of genes:

  1. Salivary gland: 17
  2. Thyroid gland: 584
  3. Smooth muscle: 127
  4. Mammary gland: 696
  5. Pancreas: 1094
  6. Spleen: 1094
  7. Gallbladder: 788
  8. Small intestine: 297
  9. Placenta: 1290
  10. Skeletal muscle: 735
  11. White blood cell: 2164
  12. Testis: 370
  13. Skin: 620
  14. Brain: 3195
  15. Eye: 547
  16. Lungs: 1887
  17. Heart: 1195
  18. Red blood cell: 8
  19. Liver: 2091
  20. Uterus: 1859

In recent years, international data banks have been created for nucleotide sequences in the DNA of various organisms and for amino acid sequences in proteins. In 1996, the International Sequencing Society decided that any newly determined nucleotide sequence of 1–2 thousand bases or more must be made public via the Internet within 24 hours of being deciphered; otherwise scientific journals will not accept articles containing these data. Any specialist in the world can use this information.

During the implementation of the Human Genome Project, many new research methods were developed, most of which have recently been automated, which significantly speeds up and reduces the cost of DNA decoding. These same analysis methods can be used for other purposes: in medicine, pharmacology, forensics, etc.

Let us dwell on some specific achievements of the project, primarily, of course, related to medicine and pharmacology.

In the world, every hundredth child is born with some kind of hereditary defect. To date, about 10 thousand different human diseases are known, more than 3 thousand of which are hereditary. Mutations have already been identified that are responsible for diseases such as hypertension, diabetes, some types of blindness and deafness, and malignant tumors. Genes responsible for one of the forms of epilepsy, gigantism, etc. have been discovered. Below are some diseases that arise as a result of damage to genes, the structure of which was completely deciphered by 1997.

Diseases resulting from gene damage

1. Chronic granulomatous disease
2. Cystic fibrosis
3. Wilson's disease
4. Early breast/ovarian cancer
5. Emery-Dreifuss muscular dystrophy
6. Spinal muscular atrophy
7. Ocular albinism
8. Alzheimer's disease
9. Hereditary paralysis
10. Dystonia

It is likely that in the coming years, ultra-early diagnosis of serious diseases will become possible, and therefore a more successful fight against them. Currently, methods are being actively developed for targeted delivery of drugs to affected cells, replacing diseased genes with healthy ones, turning on and off side metabolic pathways by turning on and off the corresponding genes. Examples of successful use of gene therapy are already known. For example, it was possible to achieve significant relief in the condition of a child suffering from severe congenital immunodeficiency by introducing normal copies of the damaged gene into him.

In addition to disease-causing genes, some other genes have been discovered that are directly related to human health. It turned out that there are genes that cause a predisposition to the development of occupational diseases in hazardous industries. Thus, in asbestos production, some people get sick and die from asbestosis, while others are resistant to it. In the future, it is possible to create a special genetic service that will provide recommendations on possible professional activities in terms of predisposition to occupational diseases.

It turned out that a predisposition to alcoholism or drug addiction can also have a genetic basis. Seven genes have already been discovered whose damage is associated with the emergence of dependence on chemical substances. A mutant gene was isolated from the tissues of patients with alcoholism that leads to defects in the cellular receptors for dopamine, a substance that plays a key role in the functioning of the pleasure centers of the brain. A lack of dopamine or defects in its receptors is directly related to the development of alcoholism. A gene was discovered on the fourth chromosome whose mutations lead to the development of early alcoholism and already in early childhood manifest themselves as hyperactivity and attention deficit.

Interestingly, gene mutations do not always lead to negative consequences - they can sometimes be beneficial. Thus, it is known that in Uganda and Tanzania, AIDS infection among prostitutes reaches 60–80%, but some of them not only do not die, but also give birth to healthy children. Apparently, there is a mutation (or mutations) that protects a person from AIDS. People with this mutation can be infected with the immunodeficiency virus but do not develop AIDS. A map has now been created to roughly reflect the distribution of this mutation in Europe. It is especially common (in 15% of the population) among the Finno-Ugric group of the population. The identification of such a mutant gene could lead to the creation of a reliable way to combat one of the most terrible diseases of our century.

It also turned out that different alleles of the same gene can cause people to react differently to drugs. Pharmaceutical companies plan to use this data to produce specific drugs intended for different patient groups. This will help eliminate adverse drug reactions, give a more precise understanding of how drugs act, and save millions in costs. A whole new field, pharmacogenetics, studies how certain features of DNA structure can weaken or enhance the effects of drugs.

Decoding the genomes of bacteria makes it possible to create new effective and harmless vaccines and high-quality diagnostic drugs.

Of course, the achievements of the Human Genome Project can be used not only in medicine or pharmaceuticals.

DNA sequences can be used to determine the degree of relationship between people, and mitochondrial DNA can be used to accurately establish maternal kinship. A method of “genetic fingerprinting” has been developed, which allows you to identify a person by trace amounts of blood, skin flakes, etc. This method has been successfully used in forensic science - thousands of people have already been acquitted or convicted on the basis of genetic analysis. Similar approaches can be used in anthropology, paleontology, ethnography, archeology, and even in such a seemingly distant field from biology as comparative linguistics.

As a result of the research, it became possible to compare the genomes of bacteria and various eukaryotic organisms. It turned out that in the process of evolutionary development the number of introns in organisms increases, i.e. evolution is associated with the “dilution” of the genome: per unit length of DNA there is less and less information about the structure of proteins and RNA (exons) and more and more regions that do not have a clear functional significance (introns). This is one of the great mysteries of evolution.

Previously, evolutionary scientists distinguished two branches in the evolution of cellular organisms: prokaryotes and eukaryotes. As a result of genome comparisons, archaebacteria had to be placed in a separate branch: these are unique single-celled organisms that combine the characteristics of prokaryotes and eukaryotes.

Currently, the problem of the dependence of a person’s abilities and talents on his genes is also being intensively studied. The main task of future research is to study single-nucleotide DNA variations in cells of different organs and identify differences between people at the genetic level. This will make it possible to create gene portraits of people and, as a result, more effectively treat diseases, assess the abilities and capabilities of each person, identify differences between populations, assess the degree of adaptation of a particular person to a particular environmental situation, etc.

Finally, it is necessary to mention the danger of disseminating genetic information about specific people. In this regard, some countries have already passed laws prohibiting the dissemination of such information, and lawyers around the world are working on this problem. In addition, the Human Genome Project is sometimes associated with the revival of eugenics at a new level, which also causes concern among experts.

The analysis of the human genome has been completed.

In Washington on April 6, 2000, a meeting of the US Congress Science Committee was held, at which Dr. J. Craig Venter announced that his company, Celera Genomics, had completed deciphering the nucleotide sequences of all the necessary fragments of the human genome. He expects that the preliminary work to sequence all the genes (there are about 80 thousand of them, and they contain approximately 3 billion DNA “letters”) will be completed in 3-6 weeks, i.e. much earlier than planned. Most likely, the final sequencing of the human genome will be completed by 2003.

Celera joined research on the Human Genome Project 22 months ago. Its approaches were initially criticized by the so-called open consortium of project participants, but a subproject it completed last month to decipher the fruit fly genome showed their effectiveness.

This time, no one criticized the forecasts that Dr. Venter made in the presence of the US Presidential Adviser on Science, Dr. N. Lane, and the representative of the consortium, the leading genome sequencing specialist Dr. Robert Waterston.

The preliminary map of the genome will contain about 90% of all genes, but, nevertheless, it will be of great help in the work of scientists and doctors, since it will allow them to quite accurately find the necessary genes. Dr Venter said he now plans to use his 300 sequencers to analyze the mouse genome, knowledge of which will help understand how human genes work.

The deciphered genome belongs to a man, and therefore contains both X and Y chromosomes. The name of this person is not known, and it does not matter, because extensive data on individual DNA variation has been and continues to be collected by both Celera and the consortium of researchers. The consortium, incidentally, uses genetic material obtained from various people in its research. Dr. Venter characterized the results obtained by the consortium as 500 thousand deciphered but unordered fragments, from which it would be very difficult to construct entire genes.

Dr. Venter said that once the structure of the genes is determined, he will organize a conference to involve outside experts in identifying the position of genes in DNA molecules and determining their functions. After this, other researchers will have free access to human genome data.

Negotiations were held between Venter and a consortium of researchers about the joint publication of their results, and one of the main points of the agreement was to provide that patenting genes was possible only after their functions and position in DNA had been accurately determined.

However, negotiations were interrupted due to disagreements over what should be considered the completion of genome sequencing. The problem is that in the DNA of eukaryotes, unlike the DNA of prokaryotes, there are fragments that cannot be deciphered by modern methods. The size of such fragments can range from 50 to 150 thousand bases, but fortunately these fragments contain very few genes. At the same time, in DNA regions rich in genes, there are fragments that also cannot yet be deciphered.

Determination of the position and functions of genes is supposed to be carried out using special computer programs. These programs will analyze the structure of genes and, comparing it with data on the genomes of other organisms, suggest options for their possible functions. According to Celera, the work can be considered complete if the genes are almost completely identified and it is known exactly how the deciphered fragments are located on the DNA molecule, i.e. in what order. The Celera results satisfy this definition, while the consortium's results do not allow us to unambiguously determine the position of the deciphered sections relative to each other.
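To make concrete what it means to know "in what order" the deciphered fragments lie, here is a toy sketch that orders fragments by merging the pair with the longest suffix-prefix overlap. This is a textbook greedy heuristic chosen for illustration; it is not the software used by Celera or by the consortium, and the fragments and the `min_len` threshold are made up.

```python
def overlap(a: str, b: str, min_len: int = 3) -> int:
    """Length of the longest suffix of a that equals a prefix of b (0 if shorter than min_len)."""
    for k in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def greedy_assemble(fragments, min_len: int = 3):
    """Repeatedly merge the two fragments with the largest overlap."""
    frags = list(fragments)
    while len(frags) > 1:
        best_k, best_i, best_j = 0, None, None
        for i, a in enumerate(frags):
            for j, b in enumerate(frags):
                if i != j:
                    k = overlap(a, b, min_len)
                    if k > best_k:
                        best_k, best_i, best_j = k, i, j
        if best_k == 0:
            break  # no overlaps left: the remaining pieces cannot be ordered
        merged = frags[best_i] + frags[best_j][best_k:]
        frags = [f for idx, f in enumerate(frags) if idx not in (best_i, best_j)]
        frags.append(merged)
    return frags


# Hypothetical fragments, given in scrambled order.
fragments = ["TTTGGAAAA", "ATGGCCTTT", "GAAAATAA"]
contigs = greedy_assemble(fragments)
print(contigs)
```

When fragments do not overlap, or when the sequence contains long repeats, such a procedure leaves several unordered pieces or joins them wrongly, which is the kind of ambiguity the dispute above was about.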

Once a complete map of the human genome has been compiled, Celera plans to make this data available to other researchers by subscription, while for universities the fee for using the data bank will be very low, $5-15 thousand per year. This will provide serious competition to the publicly funded GenBank database.

Participants at the Science Committee meeting were highly critical of companies such as Incyte Pharmaceuticals and Human Genome Sciences, which nightly copied the consortium's data available on the Internet and then applied to patent all the genes they discovered in those sequences.

When asked whether data on the human genome could be used to create a new type of biological weapon, for example, dangerous only for some populations, Dr. Venter replied that data on the genomes of pathogenic bacteria and viruses pose a much greater danger. When asked by one of the congressmen whether targeted changes in the human race would now become a reality, Dr. Venter replied that it may take about a hundred years to fully determine the functions of all genes, and until then there is no talk of targeted changes in the genome.

Let us recall that in December 1999, researchers from Great Britain and Japan announced the establishment of the structure of the 22nd chromosome. This was the first human chromosome to be decoded. It contains 33 million base pairs, and 11 sections (about 3% of the DNA length) remain undeciphered in its structure. The functions of approximately half of the genes have been determined for this chromosome. It has been established, for example, that defects in this chromosome are associated with 27 different diseases, including schizophrenia, myeloid leukemia and trisomy 22, the second leading cause of miscarriages in pregnant women.

At the time, British scientists sharply criticized the sequencing methods used by Celera, believing that they would take too long to decipher the sequences and determine the relative positions of their fragments. Then, based on the known volume of decoded material, predictions were made that the 7th, 20th and 21st chromosomes would be mapped next.

A week after the announcement of the completion of deciphering the nucleotide sequences in the human genome, a meeting of the American Association for the Advancement of Science took place, at which US Secretary of Energy Bill Richardson announced that scientists from the Joint Genome Institute had determined the structures of the 5th, 16th and 19th human chromosomes.

These chromosomes contain approximately 300 million base pairs, which is 10-15 thousand genes, or about 11% of human genetic material. So far, it has been possible to map 90% of the DNA of these chromosomes; there are still areas that cannot be deciphered, containing a small number of genes.

Chromosome maps reveal genetic defects that can lead to certain kidney diseases, prostate and colorectal cancer, leukemia, hypertension, diabetes and atherosclerosis. According to Richardson, closer to the summer, information about the structure of chromosomes will be available to all researchers for free.



The principles of heredity were first identified in the 1900s, as the natural sciences developed and the concepts of the human genome, and of the gene in particular, came into use with full definitions. Research on them enabled scientists to uncover the secret of heredity and became the impetus for the study of hereditary diseases and their nature.

Human genome: general concepts

To understand what genes are and the processes of inheritance by an organism of certain properties and qualities, you should know and understand the terms and basic provisions. A brief summary of the basic concepts will provide an opportunity to delve deeper into this topic.

Human genes are sections of a chain (deoxyribonucleic acid, a macromolecule) that determine the sequence of particular polypeptides (chains of amino acids) and carry basic hereditary information from parents to children.

In simple terms, a certain gene contains information about the structure of a protein and carries it from the parent to the child, repeating the structure of polypeptides and transmitting heredity.

The human genome is a general concept denoting the full set of a person's genes. The term was first introduced by Hans Winkler in 1920, but over time its original meaning changed somewhat.

At first it designated a single (unpaired) set of chromosomes; later it came to mean the 23 pairs of chromosomes together with mitochondrial deoxyribonucleic acid.

Genetic information is data that is contained in DNA and specifies, in the form of a code of nucleotides, the order in which proteins are built. It is also worth mentioning that this kind of information is found both inside and outside the boundaries of genes.
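The idea of a nucleotide code that specifies the order of protein construction can be shown with a small sketch. The codon assignments below are taken from the standard genetic code, but the table is deliberately partial (only the codons used in the example), and the DNA string is a made-up illustration rather than a real human gene.

```python
# Partial codon table from the standard genetic code (mRNA codon -> amino acid).
# Only the codons needed for the example are listed; others raise KeyError.
CODON_TABLE = {
    "AUG": "Met", "GCC": "Ala", "UUU": "Phe", "GGA": "Gly", "AAA": "Lys",
    "UAA": "Stop", "UAG": "Stop", "UGA": "Stop",
}


def transcribe(coding_dna: str) -> str:
    """Convert a coding-strand DNA sequence into mRNA (T becomes U)."""
    return coding_dna.upper().replace("T", "U")


def translate(mrna: str) -> list:
    """Read the mRNA three letters at a time until a stop codon is reached."""
    peptide = []
    for i in range(0, len(mrna) - 2, 3):
        residue = CODON_TABLE[mrna[i:i + 3]]
        if residue == "Stop":
            break
        peptide.append(residue)
    return peptide


# Hypothetical coding sequence used purely for illustration.
dna = "ATGGCCTTTGGAAAATAA"
peptide = translate(transcribe(dna))
print("-".join(peptide))
```

Each triplet of nucleotides selects one amino acid, so the order of nucleotides in the gene fixes the order of amino acids in the polypeptide.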

Human genes have been studied for many years, during which many experiments were carried out. Experiments that provide scientists with new information are still being conducted.

Recent research has made it clear that genes do not always have a clear and continuous structure.

There are so-called discontinuous genes, whose sequence is interrupted, which contradicts earlier theories about the constancy of these particles. Changes occur in them from time to time, which entail changes in the structure of deoxyribonucleic acids.

History of discovery

The scientific term “gene” was first introduced only in 1909 by Wilhelm Johannsen, an outstanding Danish botanist.

Important! In 1912, the word “genetics” appeared, which became the name of an entire field of science. It is this field that studies human genes.

Research into hereditary particles began long before the 20th century (the exact year is not known) and went through several stages:

  1. In 1868, the famous scientist Darwin put forward the hypothesis of pangenesis, in which he described the separation of gemmules. Darwin believed that a gemmule is a specific part of a cell from which sex cells are then formed.
  2. Later, Hugo de Vries formed his own theory, different from Darwin's, in which he described the process of pangenesis inside cells. He believed that every cell contains particles responsible for particular hereditary properties of the species, and he called these particles "pangenes". The difference between the two hypotheses is that Darwin considered gemmules to be parts of tissues and internal organs, regardless of the type of animal, whereas de Vries presented his pangenes as carriers of inherited traits within a particular species.
  3. In 1909, W. Johannsen defined the hereditary factor as a gene, taking the second part of the term used by de Vries. He used the word to denote the "germ", the particle that is hereditary. At the same time, the scientist emphasized that the term was independent of previously proposed theories.

Biologists and zoologists have been studying the hereditary factor for quite a long time, but only from the beginning of the 20th century did genetics begin to develop at tremendous speed, revealing the secrets of inheritance to people.

Decoding the human genome

From the moment scientists discovered the presence of genes in the human body, they began to investigate the information contained in them. For more than 80 years, scientists have been trying to decipher it. To date, they have achieved significant success, which has made it possible to influence hereditary processes and change the structure of cells in the next generation.

The history of DNA decoding consists of several defining moments:

  1. 19th century - the beginning of the study of nucleic acids.
  2. 1868 - F. Miescher first isolates nuclein, or DNA, from cells.
  3. By the middle of the 20th century, F. Griffith and O. Avery had found, starting from experiments conducted on mice, that it is nucleic acid that is responsible for the process of bacterial transformation.
  4. The first person to show DNA to the world was R. Franklin. Several years later, while examining the structure of crystals with X-rays, she took a photograph of DNA.
  5. In 1953, the double-helix structure of DNA was described, which gave a precise definition of the principle by which life is reproduced in all species.

Attention! Since the DNA double helix was first introduced to the public, many discoveries have been made that provide insight into the nature of DNA and how it works.

Gregor Mendel, who was the first to discover certain patterns of inheritance, is considered to be the man who discovered the gene.

But the decoding of human DNA was based on the work of another scientist, Frederick Sanger, who developed methods for reading the amino acid sequences of proteins and the nucleotide sequence of DNA itself.

Thanks to the work of many scientists over the last three centuries, the processes of formation, the features, and the number of genes in the human genome have been clarified.

In 1990 the international Human Genome Project began, directed by James Watson. Its goal was to find out in what sequence the nucleotides in DNA are arranged and to identify about 25,000 genes in humans. Thanks to this project, humanity was supposed to gain a complete understanding of the formation of DNA and the location of all its constituent parts, as well as the mechanism of gene construction.

It is worth clarifying that the program did not set out to determine the entire nucleic acid sequence in cells, but only some regions. It began in 1990, but a draft of the work was not released until 2000, and the full study was completed in 2003. Sequence research is still ongoing, and 8% of heterochromatic regions are still unidentified.

Goals and objectives

Like any scientific project, the Human Genome Project set itself specific goals and objectives. Initially, scientists intended to identify sequences of 3 billion nucleotides or more. Then separate groups of researchers expressed a desire to determine, at the same time, the sequences of biopolymers, which can be amino acid or nucleotide sequences. Eventually the main goals of the project looked like this:

  1. Create a genome map;
  2. Create a map of human chromosomes;
  3. Identify the sequence of formation of polypeptides;
  4. Create a methodology for storing and analyzing the collected information;
  5. Create technology that will help achieve all of the above goals.

This list of tasks misses an equally important, but not so obvious one - the study of the ethical, legal and social consequences of such research. The issue of heredity can cause disagreement among people and lead to serious conflicts, so scientists have made it their goal to discover solutions to these conflicts before they arise.

Achievements

Hereditary sequences are a unique phenomenon that is observed, in one form or another, in the body of every person.

The project achieved all its goals earlier than the researchers expected. By the end of the project, they had deciphered about 99.99% of the DNA, although the scientists had set themselves the task of sequencing only 95% of the data. Today, despite the success of the project, there are still unexplored regions of deoxyribonucleic acids.

As a result of the research work, it was determined how many genes are in the human body (about 20-25 thousand genes in the genome), and all of them were characterized:

  • quantity;
  • location;
  • structural and functional features.

Conclusion

All data will be presented in detail in the genetic map of the human body. The implementation of such a complex scientific project not only provided enormous theoretical knowledge for the fundamental sciences, but also had an incredible impact on the very understanding of heredity. This, in turn, could not but affect the processes of prevention and treatment of hereditary diseases.

The scientists' findings helped speed up other molecular research and contributed to an effective search for the genetic basis of inherited diseases and of predisposition to them. The results can influence the discovery of appropriate drugs for the prevention of many diseases: atherosclerosis, cardiac ischemia, mental illness and cancer.
