At Produvia, we produce intelligent software. We also write letters about artificial intelligence (AI) to founders, executives, and decision-makers from all industries. These letters are meant to inspire and motivate companies, government agencies, and countries on the topics of AI, machine learning, and deep learning technologies.
At Produvia, we believe that artificial intelligence technologies will fundamentally change how genomics, biotechnology, and life sciences startups and companies turn data into actionable insights.
Before we talk about artificial intelligence, it is important to understand the genomics industry.
The global genomics industry was worth $16.4 billion USD as of 2018 and is expected to reach $41.2 billion USD by 2025. The industry consists of genomic products and services. Genomic products are expected to dominate the market due to the recurrent use of instruments and reagents for genomics research and the rising number of research programs undertaken by government and private organizations. Genomics services include next-generation sequencing, core genomics, biomarker translation, and many others.
According to AngelList, there are 160+ genomics startups, 4,228+ biotechnology startups, 9,893+ life sciences startups, and 4,893,827+ startups overall around the world [2-5]. In other words, genomics startups represent about 4 percent of biotechnology startups, 2 percent of life sciences startups, and roughly 0.003 percent of all startups.
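These percentages follow from dividing the genomics count by each broader count. As a quick check, here is a minimal Python sketch; the variable and function names are illustrative, and the counts are the lower bounds quoted above rather than exact figures.

```python
# Startup counts quoted above from AngelList (lower bounds, not exact figures).
counts = {
    "genomics": 160,
    "biotechnology": 4_228,
    "life sciences": 9_893,
    "all startups": 4_893_827,
}


def share_percent(part: int, whole: int) -> float:
    """Return `part` as a percentage of `whole`."""
    return 100.0 * part / whole


# Share of genomics startups relative to each broader category.
shares = {
    name: share_percent(counts["genomics"], total)
    for name, total in counts.items()
    if name != "genomics"
}

for name, pct in shares.items():
    print(f"Genomics startups as a share of {name}: {pct:.4f}%")
```

Rounded, the three shares come out to roughly 4 percent, 2 percent, and 0.003 percent.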
Today, the genomics industry is booming thanks to the increasing amount of data. Over the next 10 years, genomics data is projected to match or surpass other data-intensive domains, including social media and online video.
Artificial Intelligence in Genomics
At Produvia, we predict that genomic startups that combine deep learning, computer vision, and natural language processing technologies will establish a competitive edge in the marketplace.
Deep learning, a sub-field of artificial intelligence, is combined with computer vision techniques to analyze the growing amount of genomics imagery data. In computer vision, deep learning algorithms that excel include convolutional neural networks and recurrent neural networks. These machine learning models are solving computer vision tasks such as image classification, semantic segmentation, and image retrieval.
Deep learning is also combined with natural language processing techniques to analyze the expanding amount of genomics-related text found in publicly available research papers. Deep neural networks are solving tasks such as named entity recognition, relation extraction, and information retrieval. Deep learning technologies are well suited to natural language processing tasks since they offer state-of-the-art performance and reduce the need for manual feature engineering.
At Produvia, we recognize the complexity of applying artificial intelligence in the genomics industry. As a result, we wrote this article as a guide for all stakeholders, including patients, research participants, the public, providers, researchers, advocacy groups, payers, and policymakers.
How AI and Genomics Will Save The Planet
In 2015, the United Nations (UN) set seventeen Global Goals, also known as Sustainable Development Goals (SDGs). The SDGs were adopted by all UN Member States as a universal call to action to end poverty, protect the planet, and ensure that all people enjoy peace and prosperity by 2030.
Of the seventeen SDGs, the Produvia team identified five goals that can be solved with genomics and artificial intelligence technologies.
AI Goal #1: No Poverty
Can we really end poverty? Can we grow the middle class? These are really hard questions to answer. Researchers have combined satellite imagery with machine learning to predict poverty. Poverty has been linked to disease, chronic illness, childhood obesity, elevated blood lead levels, academic achievement, and DNA methylation [8-12]. How can machine learning help with these genomic causes or correlations? If we can predict disease or DNA methylation across genes, we can take preventative action in the fight against poverty.
AI Goal #2: Zero Hunger
How can humanity end hunger? Can we achieve a stable food supply? Can we end hidden hunger, also known as micronutrient deficiency? Certain hormones regulate hunger and satiety. Hunger can be detected in crying infants using deep learning. Analyzing how people eat, or their consumption patterns, can reveal hidden hunger or micronutrient gaps. Can people improve nutrition and promote sustainable agriculture? To answer these questions, consider that plant breeding and other agricultural technologies are greatly improved by machine learning. Increasing crop yields will help close the gap between crop output and hunger. Genetically improving cultivars and improving agronomic practices are two ways to increase crop productivity. If we make agriculture more productive, we can reduce world hunger.
AI Goal #3: Good Health and Well-Being
Can we live healthier lives? Can we promote the well-being of all humanity? Better detection of AIDS, tuberculosis, malaria, and neglected tropical diseases is now possible thanks to deep learning. Imagine being able to create a personalized genomic profile of each person on Earth. Knowing where susceptibility lies would allow us to predict disease outbreaks. Humanity also has the potential to edit human reproduction. With gene editing, we could create a next generation of humans who are immune to the latest diseases and common health conditions. Combining gene editing with machine learning could allow humanity to achieve customized genetic and genomic profiles of individuals. If we can better understand how the aging process affects health and longevity, we can create healthier societies. Today, we can use deep learning to detect changes in biomarkers (i.e., physiological variables, composite indices) using data from longitudinal studies.
AI Goal #4: Life Below Water
Can we conserve ocean life? Can humanity use the oceans, seas, and marine resources sustainably? Genomics and machine learning can help solve many problems to ensure the continuation of life below water. For example, we can classify ocean acidity to help slow the decline of fish stocks. We can apply conservation genomics with deep learning technologies to predict the biodiversity of living organisms. Can we improve our aquaculture? Over the past few decades, advancements in agricultural biotechnology have changed the way research is analyzed. Today, genomic data is analyzed using a variety of computational tools, including machine learning and deep learning.
AI Goal #5: Life on Land
Can we protect our ecosystems? Can we restore and promote sustainable use of terrestrial ecosystems, sustainably manage forests, and combat desertification? Lastly, can humanity halt and reverse land degradation and halt biodiversity loss? Understanding complex ecosystems and how genes are affected by the environment is possible thanks to machine learning technologies. Deep learning can be combined with genome-scale metabolic modeling. Machine learning technologies have demonstrated the ability to analyze large, complex biological data. Furthermore, the massive and rapid advancements in both biological data generation and machine learning methodologies are promising for further understanding of genomics and biological data. It’s now possible to classify microbial roles in ecosystems using deep learning. Genomic tools, such as population genomics, meta-omics, and genome editing, can also help restore ecosystems and biodiversity. Meta-omics can improve the assessment and monitoring of restoration outcomes. Gene editing can generate novel genotypes for restoring challenging environments. Using machine learning to analyze population genomics, meta-omics, and genome editing data will aid companies in developing solutions to improve life on Earth.
AI Research in Genomics
Artificial intelligence research is driving technological breakthroughs across all industry verticals, genomics included. Reading academic papers takes time, and the technical language is not easy to understand. At Produvia, we keep up-to-date with the latest academic research papers so you don’t have to. Below, we highlight 20 AI and machine learning use cases for genomics [18-26]:
Genomics is an interdisciplinary field of biology focusing on the structure, function, evolution, mapping, and editing of genomes. Here are five AI and machine learning applications for genomics:
- Extract genomic and epigenomic variants of clinical utility
- Identify genes
- Predict genomic associations
- Predict protein functions
- Predict the sequence specificity of DNA- and RNA-binding proteins
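Tasks such as predicting the sequence specificity of DNA-binding proteins typically start by turning a DNA string into numbers. Below is a minimal Python sketch, assuming a one-hot encoding and a single hand-written filter. The names `one_hot`, `scan`, and `TATA_FILTER`, the filter weights, and the example sequence are all hypothetical; a real convolutional neural network would learn many such filters from labeled data rather than use a fixed one.

```python
BASES = "ACGT"


def one_hot(sequence: str) -> list[list[int]]:
    """Encode a DNA string as one 4-element 0/1 row per base (A, C, G, T)."""
    return [[1 if base == b else 0 for b in BASES] for base in sequence.upper()]


# Hypothetical, hand-set filter that rewards the motif "TATA": one row per
# motif position, one weight per base in A, C, G, T order.
TATA_FILTER = [
    [0, 0, 0, 1],  # T
    [1, 0, 0, 0],  # A
    [0, 0, 0, 1],  # T
    [1, 0, 0, 0],  # A
]


def scan(sequence: str, motif_filter: list[list[int]]) -> list[int]:
    """Slide the filter along the sequence without padding (the sliding dot
    product that deep learning libraries call a 1D convolution) and return
    one score per window."""
    encoded = one_hot(sequence)
    width = len(motif_filter)
    scores = []
    for start in range(len(encoded) - width + 1):
        window = encoded[start:start + width]
        score = sum(
            w * x
            for filter_row, base_row in zip(motif_filter, window)
            for w, x in zip(filter_row, base_row)
        )
        scores.append(score)
    return scores


scores = scan("GCTATAAG", TATA_FILTER)
best_start = scores.index(max(scores))
print(scores, best_start)
```

The highest score falls on the window where the motif begins, which is the basic signal a trained model builds on when it predicts binding sites from sequence.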
Regulatory genomics is the study of genomic regions or features and how they regulate genes. At Produvia, we list five AI and machine learning applications for regulatory genomics:
- Classify gene expression
- Predict gene expression from genotype
- Predict promoters and enhancers
- Predict splicing
- Predict transcription factors and RNA-binding proteins
The field of molecular biology that attempts to describe gene functions and interactions is functional genomics. Here are five AI applications for functional genomics:
- Classify mutations and functional activities
- Classify subcellular localization
- Predict promoters and enhancers
- Predict splicing
- Predict transcription factors and RNA-binding proteins
Structural genomics is the field of genomics that involves the characterization of genome structures. At Produvia, we list five AI and machine learning applications for structural genomics:
- Classify protein tertiary structures
- Classify structures of proteins
- Predict contact maps
- Predict physical properties
- Predict protein secondary structures
AI Ideas for Genomics
You’re interested in artificial intelligence and machine learning, but don’t know where to start. At Produvia, we brainstormed several ideas for the application of artificial intelligence technologies in genomics. Here are thirty-five AI ideas for genomics:
- Annotate genes based on structure and chromosomes
- Classify cancer from gene expression profiles
- Classify genes
- Classify genomic profiles
- Classify mutation types
- Design targeted therapies
- Detect deoxyribonucleic acid regions that are predictive of gene expression
- Determine relationships between genotypes and phenotypes
- Discover drugs for genomic medicine
- Distinguish between cancer and adenoma
- Estimate prevalence for chromatin marks
- Extract transcriptome patterns
- Identify biomarkers for a disease
- Identify enhancers
- Identify pairwise variable associations between genomic data types
- Identify positioned nucleosomes
- Identify potentially valuable disease biomarkers
- Identify promoters
- Identify subtype of breast cancer tumor
- Identify transcription factor binding sites
- Identify transcription start sites, splice sites, exons
- Interpret regulatory control in single cells
- Model regulatory elements
- Partition and label the genome with chromatin state annotation
- Predict chromatin marks from deoxyribonucleic acid sequences
- Predict disease phenotype or prognosis
- Predict gene function
- Predict genetic interactions
- Predict protein backbones from protein sequences
- Predict regulatory functions and relationships
- Predict the sequence specificity of enhancers and cis-regulatory regions
- Predict the specificities of deoxyribonucleic acid-binding and ribonucleic acid-binding proteins
- Predict the splicing activity of individual exons
- Predict variant deleteriousness
- Quantify effects of single nucleotide variants on chromatin accessibility
Challenges and Opportunities in Genomics
The use of artificial intelligence technologies to solve genomics problems poses many challenges. These industry challenges also present opportunities for AI technology providers, such as Produvia, to solve market problems and create AI solutions. Below, we list three genomics challenges:
- Generating ground-truth labels or genomics datasets can be expensive
- “Right to an explanation” laws must be addressed
- Longitudinal studies are required
How can AI companies overcome these challenges? At Produvia, we believe that industry collaboration will overcome Challenge #1, algorithmic transparency will overcome Challenge #2, and long-term research projects will overcome Challenge #3.
The combination of artificial intelligence technologies and genomics has the potential to help end poverty, end hunger, and protect, restore, and promote aquatic and terrestrial ecosystems.
Interested in solving genomics problems?
Tell us about your project and a member of our team will get back to you.
This post originally appeared on Medium on October 28, 2019.
1. Zion Market Research. (2019). Global Genomics Market Will Reach USD 41.2 Billion By 2025: Zion Market Research. GlobeNewswire News Room. Retrieved 1 September 2019, from https://www.globenewswire.com/news-release/2019/04/10/1801776/0/en/Global-Genomics-Market-Will-Reach-USD-41-2-Billion-By-2025-Zion-Market-Research.html
2. Genomics Startups. (2019, October 26). Retrieved October 26, 2019, from AngelList website: https://angel.co/genomics-2
3. Biotechnology Startups. (2019, October 26). Retrieved October 26, 2019, from AngelList website: https://angel.co/biotechnology
4. Life Sciences Startups. (2019, October 26). Retrieved October 26, 2019, from AngelList website: https://angel.co/life-sciences
5. All Startups. (2019, October 26). Retrieved October 26, 2019, from AngelList website: https://angel.co/all-markets
6. dpicampaigns. (2018). About the Sustainable Development Goals - United Nations Sustainable Development. Retrieved October 26, 2019, from United Nations Sustainable Development website: https://www.un.org/sustainabledevelopment/sustainable-development-goals/
7. Jean, N., Burke, M., Xie, M., Davis, W. M., Lobell, D. B., & Ermon, S. (2016). Combining satellite imagery and machine learning to predict poverty. Science, 353(6301), 790–794. https://doi.org/10.1126/science.aaf7894
8. Global genomics disparities in the wake of personalised medical services: International Journal of Medical Engineering and Informatics: Vol 1, No 4. (2009). Retrieved October 27, 2019, from International Journal of Medical Engineering and Informatics website: https://www.inderscienceonline.com/doi/abs/10.1504/IJMEI.2009.026812
9. Newacheck, P. W. (1994). Poverty and Childhood Chronic Illness. Archives of Pediatrics & Adolescent Medicine, 148(11), 1143. https://doi.org/10.1001/archpedi.1994.02170110029005
10. Chokshi, D. A. (2018). Income, Poverty, and Health Inequality. JAMA, 319(13), 1312. https://doi.org/10.1001/jama.2018.2521
11. Wexler, B. E., Imal, A. E., Pittman, B., & Bell, M. D. (2019). Executive Function Deficits Mediate Effects of Poverty on Academic Achievement: An Important Target for Interventions to Enhance Neurocognitive Development in At-Risk Children. Retrieved October 27, 2019, from Ssrn.com website: https://papers.ssrn.com/sol3/papers.cfm?abstract_id=3369774
12. Poverty leaves a mark on our genes. (2019). Retrieved October 26, 2019, from Northwestern.edu website: https://news.northwestern.edu/stories/2019/04/poverty-leaves-a-mark-on-our-genes/
13. Mesmar, B., & Steinle, N. (2020). Genomics of Eating Behavior and Appetite Regulation. Principles of Nutrigenetics and Nutrigenomics, 159–165. https://doi.org/10.1016/b978-0-12-804572-5.00020-3
14. Barajas-Montiel, S. E., & Reyes-Garcia, C. A. (2019). Identifying Pain and Hunger in Infant Cry with Classifiers Ensembles. International Conference on Computational Intelligence for Modelling, Control and Automation and International Conference on Intelligent Agents, Web Technologies and Internet Commerce (CIMCA-IAWTIC’06). https://doi.org/10.1109/cimca.2005.1631561
15. Borrill, P., Harrington, S. A., & Uauy, C. (2018). Applying the latest advances in genomics and phenomics for trait discovery in polyploid wheat. The Plant Journal. https://doi.org/10.1111/tpj.14150
16. Zampieri, G., Vijayakumar, S., Yaneske, E., & Angione, C. (2019). Machine and deep learning meet genome-scale metabolic modeling. PLOS Computational Biology, 15(7), e1007084. https://doi.org/10.1371/journal.pcbi.1007084
17. Handley, K. M. (2019). Determining Microbial Roles in Ecosystem Function: Redefining Microbial Food Webs and Transcending Kingdom Barriers. MSystems, 4(3). https://doi.org/10.1128/msystems.00153-19
18. Akdemir, D. (2013). Locally epistatic genomic relationship matrices for genomic association, prediction and selection. arXiv.org. Retrieved 18 September 2019, from https://arxiv.org/abs/1302.3463
19. Hoadley, E. (2011). Joint and individual variation explained (JIVE) for integrated analysis of multiple data types. ArXiv E-Prints, arXiv:1102.4110. Retrieved from https://ui.adsabs.harvard.edu/abs/2011arXiv1102.4110L/abstract
20. Wikipedia Contributors. (2019, October 23). Sustainable Development Goals. Retrieved October 26, 2019, from Wikipedia website: https://en.wikipedia.org/wiki/Sustainable_Development_Goals
21. Deep Learning in Medical Image Analysis. (2019). @AnnualReviews. Retrieved 7 October 2019, from https://www.annualreviews.org/doi/10.1146/annurev-bioeng-071516-044442
22. SDGs .:. Sustainable Development Knowledge Platform. (2015). Retrieved October 26, 2019, from Un.org website: https://sustainabledevelopment.un.org/topics/sustainabledevelopmentgoals
23. Deep learning for genomics. (2018). Nature Genetics, 51(1), 1–1. https://doi.org/10.1038/s41588-018-0328-0
24. Xiong, M., & Ma, L. (2013). An Efficient Sufficient Dimension Reduction Method for Identifying Genetic Variants of Clinical Significance. arXiv.org. Retrieved 18 September 2019, from https://arxiv.org/abs/1301.3528
25. Kwak, G. H.-J., & Hui, P. (2019). DeepHealth: Deep Learning for Health Informatics. Retrieved October 28, 2019, from arXiv.org website: https://arxiv.org/abs/1909.00384
26. Dinalankara, W., & Bravo, H. (2013). Anomaly Classification with the Anti-Profile Support Vector Machine. arXiv.org. Retrieved 18 September 2019, from https://arxiv.org/abs/1301.3514