Genomics and Artificial Intelligence Technologies
At Produvia, we produce intelligent software. We also write letters about artificial intelligence (AI) to founders, executives, and decision-makers from all industries. These letters are meant to inspire and motivate companies, government agencies, and countries on the topics of AI, machine learning, and deep learning technologies.
At Produvia, we believe that artificial intelligence technologies will fundamentally change how genomics, biotechnology, and life sciences startups and companies turn data into actionable insights.
Before we talk about artificial intelligence, it is important to understand the genomics industry first.
Genomics Industry
The global genomics industry is worth $16.4 Billion USD as of 2018 and is expected to reach $41.2 Billion USD by 2025. The genomics industry consists of genomic products and services. The genomic products are expected to dominate the market due to the recurrent use of instruments and reagents for genomics research and the rising number of research programs undertaken by government and private organizations. The genomics services include next-generation sequencing, core genomics, biomarker translations, and many others. [1]
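As a rough check on these figures, the annual growth rate implied by the two market sizes can be worked out directly. The sketch below assumes smooth compound growth between 2018 and 2025, which the cited report may not model in the same way; the function name implied_cagr is ours and is used only for illustration.

```python
def implied_cagr(start_value: float, end_value: float, years: int) -> float:
    """Return the compound annual growth rate implied by two values."""
    if start_value <= 0 or years <= 0:
        raise ValueError("start_value and years must be positive")
    return (end_value / start_value) ** (1 / years) - 1


# Market sizes in billions of USD, as quoted in the text above.
MARKET_2018 = 16.4
MARKET_2025 = 41.2

# Assumption: steady compound growth over the seven years 2018 to 2025.
cagr = implied_cagr(MARKET_2018, MARKET_2025, years=2025 - 2018)
print(f"Implied CAGR: {cagr:.1%}")
```

Under that assumption, the market would need to grow by roughly 14 percent per year to reach the 2025 estimate.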
According to AngelList, there are 160+ genomics startups, 4,228+ biotechnology startups, 9,893+ life sciences startups, and 4,893,827+ startups overall around the world [2-5]. In other words, genomics startups represent about 4 percent of biotechnology startups, 2 percent of life sciences startups, and about 0.003 percent of all startups.
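The percentages above follow from simple division of the AngelList counts. The short sketch below reproduces that arithmetic; it assumes the categories can be compared as nested groups, which the listings themselves do not guarantee, and the names GENOMICS, COMPARISON_GROUPS, and share_percent are illustrative only.

```python
# Startup counts as quoted from AngelList in the text above.
GENOMICS = 160
COMPARISON_GROUPS = {
    "biotechnology": 4_228,
    "life sciences": 9_893,
    "all startups": 4_893_827,
}


def share_percent(part: int, whole: int) -> float:
    """Return part as a percentage of whole."""
    return 100 * part / whole


# Assumption: genomics startups are counted inside each larger group.
shares = {
    name: share_percent(GENOMICS, total)
    for name, total in COMPARISON_GROUPS.items()
}

for name, pct in shares.items():
    print(f"Genomics startups vs {name}: {pct:.4f}%")
```

Rounded, this gives about 4 percent of biotechnology, about 2 percent of life sciences, and about 0.003 percent of all startups.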
Today, the genomics industry is booming thanks to the increasing amount of data. Genomics data in the next 10 years is projected to equal and surpass other data-intensive disciplines including social media and online videos. [6]
Artificial Intelligence in Genomics
At Produvia, we predict that genomic startups that combine deep learning, computer vision, and natural language processing technologies will establish a competitive edge in the marketplace.
Deep learning, a sub-field of artificial intelligence, is combined with computer vision techniques to analyze the growing amount of genomics imagery data. In computer vision, deep learning algorithms that excel include convolutional neural networks and recurrent neural networks. These machine learning models are solving computer vision tasks such as image classification, semantic segmentation, and image retrieval.
Deep learning is also combined with natural language processing techniques to analyze the expanding amount of genomics-related text found in publicly available research papers. Deep neural networks are solving tasks such as named entity recognition, relation extraction, and information retrieval. Deep learning technologies are well suited to natural language processing tasks since they offer state-of-the-art performance and reduce the need for manual feature engineering.
At Produvia, we recognize the complexity of applying artificial intelligence in the genomics industry. As a result, we wrote this article as a guide for any stakeholders including patients, research participants, the public, providers, researchers, advocacy groups, payers, and policymakers.
How AI and Genomics Will Save The Planet
In 2015, the United Nations (UN) set seventeen Global Goals, also known as Sustainable Development Goals (SDGs). The SDGs were adopted by all UN Member States as a universal call to action to end poverty, protect the planet, and ensure that all people enjoy peace and prosperity by 2030. [6]
Of the seventeen SDGs, the Produvia team identified five goals that can be solved with genomics and artificial intelligence technologies.
AI Goal #1: No Poverty
Can we really end poverty? Can we grow the middle class? These are really hard questions to answer. Satellite imagery was combined with machine learning to predict poverty [7]. Poverty has been linked to disease, chronic illness, childhood obesity, elevated blood lead levels, academic achievements, and DNA methylation [8-12]. How can machine learning help with these genomic causations or correlations? If we can predict disease or DNA methylation across genes, we can take preventative action in the fight against poverty.
AI Goal #2: Zero Hunger
How can humanity end hunger? Can we achieve a stable food supply? Can we end hidden hunger, also known as micronutrient deficiency? Certain hormones regulate hunger and satiety [13]. Hunger can be detected in crying infants using machine learning [14]. Analyzing how people eat, or their consumption patterns, can reveal hidden hunger or gaps in micronutrient intake. Can people improve nutrition and promote sustainable agriculture? To answer these questions, consider that plant breeding and other agricultural technologies are greatly improved using machine learning. Increasing crop yields will close the gap between crop output and hunger. Genetically improving cultivars and improving agronomic practices are two ways to increase crop productivity [15]. If we make agriculture more productive, we can reduce world hunger.
AI Goal #3: Good Health and Well-Being
Can we live a healthier life? Can we promote the well-being of all humanity? Better detection of AIDS, tuberculosis, malaria, and neglected tropical diseases is now possible thanks to deep learning. Imagine being able to create personalized genomic profiles of each person on earth. Knowing where susceptibility lies would allow us to predict outbreaks of disease. Humanity has the potential to edit human reproduction. With gene editing, we can create the next generation of humans who are immune to the latest diseases and typical health conditions. Combining gene editing with machine learning will allow humanity to achieve customized genetic and genomic profiles of individuals. If we can better understand how the aging process affects health and longevity, we can create healthier societies. Today, we can use deep learning to detect changes in biomarkers (e.g., physiological variables, composite indices) using data from longitudinal studies.
AI Goal #4: Life Below Water
Can we conserve ocean life? Can humanity use the oceans, seas, and marine resources sustainably? Genomics and machine learning can solve many problems to ensure the continuation of life below water. For example, we can classify ocean acidity to help slow the decline of fish stocks. We can apply conservation genomics with deep learning technologies to predict the biodiversity of living organisms. Can we improve our aquaculture? Over the past few decades, advancements in agricultural biotechnology have changed the way research is analyzed. Today, genomic data is analyzed using a variety of computational tools, including machine learning and deep learning.
AI Goal #5: Life on Land
Can we protect our ecosystem? Can we restore and promote sustainable use of terrestrial ecosystems, sustainably manage forests, and combat desertification? Lastly, can humanity halt and reverse land degradation and halt biodiversity loss? Understanding complex ecosystems and how genes are affected by the environment is possible thanks to machine learning technologies. Machine and deep learning can be combined with genome-scale metabolic modeling [16]. Machine learning technologies have demonstrated the ability to analyze large, complex biological data. Furthermore, the massive and rapid advancements in both biological data generation and machine learning methodologies are promising for further understanding of genomics and biological data. It’s now possible to classify microbial roles in ecosystems using deep learning [17]. Genomic tools, such as population genomics, meta-omics, and genome editing, can also restore ecosystems and biodiversity. Meta-omics can improve the assessment and monitoring of restoration outcomes. Gene editing can generate novel genotypes for restoring challenging environments. Using machine learning to analyze population genomics, meta-omics, and genome editing data will aid companies in developing solutions to improve life on earth.
AI Research in Genomics
Artificial intelligence research is driving technological breakthroughs across all industry verticals, genomics included. Reading academic papers takes time, and the technical language is not easy to understand. At Produvia, we keep up to date with the latest academic research papers so you don’t have to. Below, we highlight 20 AI and machine learning use cases for genomics [18-26]:
Genomics
Genomics is an interdisciplinary field of biology focusing on the structure, function, evolution, mapping, and editing of genomes. Here are five AI and machine learning applications for genomics:
Extract genomic and epigenomic variants of clinical utility
Identify genes
Predict genomic associations
Predict protein functions
Predict the sequence specificity of DNA- and RNA-binding proteins
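To make the last item in the list above more concrete, here is a minimal sketch of the idea behind convolutional models for sequence specificity: a DNA sequence is one-hot encoded, and a small weight matrix is slid along it to score each position. The filter below is hand-written and hypothetical; in a trained network the weights would be learned from binding data. The function names one_hot and scan, and the example sequence, are our own illustrations and not part of any library.

```python
BASES = "ACGT"


def one_hot(sequence: str) -> list[list[int]]:
    """Encode a DNA string as one-hot rows in the order A, C, G, T."""
    return [[1 if base == b else 0 for b in BASES] for base in sequence.upper()]


def scan(sequence: str, weights: list[list[float]]) -> list[float]:
    """Slide a weight matrix along the sequence, as a 1D convolution would."""
    encoded = one_hot(sequence)
    width = len(weights)
    scores = []
    for start in range(len(encoded) - width + 1):
        window = encoded[start:start + width]
        # Dot product between the filter and the one-hot window.
        score = sum(
            w * x
            for w_row, x_row in zip(weights, window)
            for w, x in zip(w_row, x_row)
        )
        scores.append(score)
    return scores


# Hypothetical filter that rewards the motif "TATA" (columns: A, C, G, T).
TATA_FILTER = [
    [0.0, 0.0, 0.0, 1.0],  # T
    [1.0, 0.0, 0.0, 0.0],  # A
    [0.0, 0.0, 0.0, 1.0],  # T
    [1.0, 0.0, 0.0, 0.0],  # A
]

scores = scan("GCTATAAG", TATA_FILTER)
best_position = max(range(len(scores)), key=scores.__getitem__)
print(best_position, scores[best_position])
```

The highest score marks where the motif matches best. A deep learning model stacks many such learned filters and combines their outputs to predict binding.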
Regulatory Genomics
Regulatory genomics is the study of genomic regions or features and how they regulate genes. At Produvia, we list five AI and machine learning applications for regulatory genomics:
Classify gene expression
Predict gene expression from genotype
Predict promoters and enhancers
Predict splicing
Predict transcription factors and RNA-binding proteins
Functional Genomics
The field of molecular biology that attempts to describe gene functions and interactions is functional genomics. Here are five AI applications for functional genomics:
Classify mutations and functional activities
Classify subcellular localization
Predict promoters and enhancers
Predict splicing
Predict transcription factors and RNA-binding proteins
Structural Genomics
Structural genomics is the field of genomics that involves the characterization of genome structures. At Produvia, we list five AI and machine learning applications for structural genomics:
Classify protein tertiary structures
Classify structures of proteins
Predict contact maps
Predict physical properties
Predict protein secondary structures
AI Ideas for Genomics
You’re interested in artificial intelligence and machine learning, but don’t know where to start. At Produvia, we brainstormed several ideas for the application of artificial intelligence technologies in genomics. Here are thirty-five AI ideas for genomics:
Annotate genes based on structure and chromosomes
Classify cancer from gene expression profiles
Classify genes
Classify genomic profiles
Classify mutation types
Design targeted therapies
Detect deoxyribonucleic acid regions that are predictive of gene expression
Determine relationships between genotypes and phenotype
Discover drugs for genomic medicine
Distinguish between cancer and adenoma
Estimate prevalence for chromatin marks
Extract transcriptome patterns
Identify biomarkers for a disease
Identify enhancers
Identify pairwise variable associations between genomic data types
Identify positioned nucleosomes
Identify potentially valuable disease biomarkers
Identify promoters
Identify subtype of breast cancer tumor
Identify transcription factor binding sites
Identify transcription start sites, splice sites, exons
Interpret regulatory control in single cells
Model regulatory elements
Partition and label the genome with chromatin state annotation
Predict chromatin marks from deoxyribonucleic acid sequences
Predict disease phenotype or prognosis
Predict gene function
Predict genetic interactions
Predict protein backbones from protein sequences
Predict regulatory functions and relationships
Predict the sequence specificity of enhancer and cis-regulatory regions
Predict the specificities of deoxyribonucleic acid-binding and ribonucleic acid-binding proteins
Predict the splicing activity of individual exons
Predict variant deleteriousness
Quantify effects of single nucleotide variants on chromatin accessibility
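As a starting point for one idea in the list above, classifying cancer from gene expression profiles, the sketch below uses a nearest-centroid classifier. This is a deliberately simple baseline and not a deep learning model; it only shows the shape of the problem. The expression values are invented for three hypothetical genes and are not real data, and the function names are ours.

```python
import math


def centroid(profiles: list[list[float]]) -> list[float]:
    """Average expression per gene across a group of samples."""
    return [sum(values) / len(values) for values in zip(*profiles)]


def distance(a: list[float], b: list[float]) -> float:
    """Euclidean distance between two expression profiles."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def fit(training: dict[str, list[list[float]]]) -> dict[str, list[float]]:
    """Compute one centroid per class label."""
    return {label: centroid(profiles) for label, profiles in training.items()}


def predict(centroids: dict[str, list[float]], profile: list[float]) -> str:
    """Assign the label whose centroid is closest to the profile."""
    return min(centroids, key=lambda label: distance(centroids[label], profile))


# Invented expression values for three hypothetical genes; not real data.
TRAINING = {
    "tumor": [[8.1, 2.0, 5.5], [7.9, 2.4, 5.1]],
    "normal": [[3.0, 6.2, 5.0], [3.4, 5.8, 5.4]],
}

model = fit(TRAINING)
print(predict(model, [7.5, 2.5, 5.2]))
print(predict(model, [3.1, 6.0, 5.3]))
```

Real expression datasets have thousands of genes and far fewer labeled samples, which is one reason more expressive models and careful validation are needed in practice.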
Challenges and Opportunities in Genomics
The use of artificial intelligence technologies to solve genomics problems poses many challenges. These industry challenges also present opportunities for AI technology providers, such as Produvia, to solve market problems and create AI solutions. Below, we list three challenges that double as opportunities:
Generating ground-truth labels or genomics datasets can be expensive
“Right to an explanation” laws must be addressed
Longitudinal studies are required
How can AI companies overcome these challenges? At Produvia, we believe that industry collaboration will overcome Challenge #1, algorithmic transparency will overcome Challenge #2, and long-term research projects will overcome Challenge #3.
Conclusion
The combination of artificial intelligence technologies and genomics has the potential to end poverty, end hunger, and protect, restore, and promote aquatic and terrestrial ecosystems.
Next Step
Are you interested in solving genomics problems?
Schedule a discovery call with Slava Kurilyak, Founder/CEO at Produvia.
Slava Kurilyak helps purpose-driven organizations to increase revenue and decrease expenses by developing artificial intelligence solutions that drive impact.
At Produvia, we serve companies with $1+ million in revenue to accelerate the development of artificial intelligence technologies.
References
Research, Z. (2019). Global Genomics Market Will Reach USD 41.2 Billion By 2025: Zion Market Research. GlobeNewswire News Room. Retrieved 1 September 2019, from https://www.globenewswire.com/news-release/2019/04/10/1801776/0/en/Global-Genomics-Market-Will-Reach-USD-41-2-Billion-By-2025-Zion-Market-Research.html
Genomics Startups. (2019, October 26). Retrieved October 26, 2019, from AngelList website: https://angel.co/genomics-2
Biotechnology Startups. (2019, October 26). Retrieved October 26, 2019, from AngelList website: https://angel.co/biotechnology
Life Sciences Startups. (2019, October 26). Retrieved October 26, 2019, from AngelList website: https://angel.co/life-sciences
All Startups Startups. (2019, October 26). Retrieved October 26, 2019, from AngelList website: https://angel.co/all-markets
dpicampaigns. (2018). About the Sustainable Development Goals — United Nations Sustainable Development. Retrieved October 26, 2019, from United Nations Sustainable Development website: https://www.un.org/sustainabledevelopment/sustainable-development-goals/
Jean, N., Burke, M., Xie, M., Davis, W. M., Lobell, D. B., & Ermon, S. (2016). Combining satellite imagery and machine learning to predict poverty. Science, 353(6301), 790–794. https://doi.org/10.1126/science.aaf7894
Global genomics disparities in the wake of personalised medical services: International Journal of Medical Engineering and Informatics: Vol 1, No 4. (2009). Retrieved October 27, 2019, from International Journal of Medical Engineering and Informatics website: https://www.inderscienceonline.com/doi/abs/10.1504/IJMEI.2009.026812
Newacheck, P. W. (1994). Poverty and Childhood Chronic Illness. Archives of Pediatrics & Adolescent Medicine, 148(11), 1143. https://doi.org/10.1001/archpedi.1994.02170110029005
Chokshi, D. A. (2018). Income, Poverty, and Health Inequality. JAMA, 319(13), 1312. https://doi.org/10.1001/jama.2018.2521
Wexler, B. E., Imal, Ahmet Esat, Pittman, B., & Bell, M. D. (2019). Executive Function Deficits Mediate Effects of Poverty on Academic Achievement: An Important Target for Interventions to Enhance Neurocognitive Development in At-Risk Children. Retrieved October 27, 2019, from Ssrn.com website: https://papers.ssrn.com/sol3/papers.cfm?abstract_id=3369774
Poverty leaves a mark on our genes. (2019). Retrieved October 26, 2019, from Northwestern.edu website: https://news.northwestern.edu/stories/2019/04/poverty-leaves-a-mark-on-our-genes/
Mesmar, B., & Steinle, N. (2020). Genomics of Eating Behavior and Appetite Regulation. Principles of Nutrigenetics and Nutrigenomics, 159–165. https://doi.org/10.1016/b978-0-12-804572-5.00020-3
Barajas-Montiel, S. E., & Reyes-Garcia, C. A. (2019). Identifying Pain and Hunger in Infant Cry with Classifiers Ensembles. International Conference on Computational Intelligence for Modelling, Control and Automation and International Conference on Intelligent Agents, Web Technologies and Internet Commerce (CIMCA-IAWTIC’06). https://doi.org/10.1109/cimca.2005.1631561
Borrill, P., Harrington, S. A., & Uauy, C. (2018). Applying the latest advances in genomics and phenomics for trait discovery in polyploid wheat. The Plant Journal. https://doi.org/10.1111/tpj.14150
Zampieri, G., Vijayakumar, S., Yaneske, E., & Angione, C. (2019). Machine and deep learning meet genome-scale metabolic modeling. PLOS Computational Biology, 15(7), e1007084. https://doi.org/10.1371/journal.pcbi.1007084
Handley, K. M. (2019). Determining Microbial Roles in Ecosystem Function: Redefining Microbial Food Webs and Transcending Kingdom Barriers. MSystems, 4(3). https://doi.org/10.1128/msystems.00153-19
Akdemir, D. (2013). Locally epistatic genomic relationship matrices for genomic association, prediction and selection. arXiv.org. Retrieved 18 September 2019, from https://arxiv.org/abs/1302.3463
Hoadley, E. (2011). Joint and individual variation explained (JIVE) for integrated analysis of multiple data types. ArXiv E-Prints, arXiv:1102.4110. Retrieved from https://ui.adsabs.harvard.edu/abs/2011arXiv1102.4110L/abstract
Wikipedia Contributors. (2019, October 23). Sustainable Development Goals. Retrieved October 26, 2019, from Wikipedia website: https://en.wikipedia.org/wiki/Sustainable_Development_Goals
Deep Learning in Medical Image Analysis. (2019). @AnnualReviews. Retrieved 7 October 2019, from https://www.annualreviews.org/doi/10.1146/annurev-bioeng-071516-044442
SDGs .:. Sustainable Development Knowledge Platform. (2015). Retrieved October 26, 2019, from Un.org website: https://sustainabledevelopment.un.org/topics/sustainabledevelopmentgoals
Deep learning for genomics. (2018). Nature Genetics, 51(1), 1–1. https://doi.org/10.1038/s41588-018-0328-0
Xiong, M., & Ma, L. (2013). An Efficient Sufficient Dimension Reduction Method for Identifying Genetic Variants of Clinical Significance. arXiv.org. Retrieved 18 September 2019, from https://arxiv.org/abs/1301.3528
Kwak, G. H.-J., & Hui, P. (2019). DeepHealth: Deep Learning for Health Informatics. Retrieved October 28, 2019, from arXiv.org website: https://arxiv.org/abs/1909.00384
Dinalankara, W., & Bravo, H. (2013). Anomaly Classification with the Anti-Profile Support Vector Machine. arXiv.org. Retrieved 18 September 2019, from https://arxiv.org/abs/1301.3514