The Customer

Nitto BioPharma, Inc., based in San Diego, develops and delivers innovative, life-transforming therapies for patients’ unmet medical needs and accelerates the ability to bring these products to market. Nitto BioPharma is a division of Nitto Denko Corporation, based in Osaka, Japan.

Customer Challenge

siRNA silencing is considered one of the most promising techniques for future therapies targeting viral-mediated and gene-mediated diseases such as HIV, HBV and cancer. The key to this technique is accurately predicting each siRNA’s inhibition efficiency and selecting the right siRNA. AI Dynamics worked with Nitto BioPharma to white-label a BLAST integration solution built on the company’s NeoPulse® end-to-end enterprise AI platform for experts and non-experts alike.

Nitto BioPharma came to AI Dynamics requesting a tool that extracts potential siRNA sequences, each 19 base pairs long, from a provided target protein sequence and ranks them by predicted inhibition value. Once the inhibition values were available, Nitto BioPharma wanted to submit the siRNA sequences with the highest values to BLAST (Basic Local Alignment Search Tool), which finds regions of similarity between biological sequences by comparing nucleotide sequences to sequence databases and calculating the statistical significance of the matches.

Solution

The NeoPulse platform takes a FASTA sequence (a text-based format for representing nucleotide sequences) or a target name as input and returns the potential siRNAs ranked from highest to lowest predicted inhibition. A deep learning regression model predicts the inhibition of a 19-nucleotide siRNA from its nucleic acid sequence. The model is based on the Transformer, a state-of-the-art NLP architecture, which significantly improves model performance. Transfer learning is applied during training, using features extracted from more than 2 million human RNA sequences.

AI Dynamics created a model to predict the inhibition of a 19-nucleotide siRNA from its nucleic acid sequence. The project queried an API that uses the imported nitto_reg to compute inhibition values. To obtain the full FASTA sequences (text records that represent nucleotides with single-letter codes) and their annotations, AI Dynamics used the Entrez API, which provides access to the Entrez molecular biology database system and its integrated nucleotide sequence data, and the GeneNames API, which provides access to the database of the HUGO Gene Nomenclature Committee (HGNC), the body responsible for approving gene names and symbols for every known human gene.
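As an illustration of the FASTA retrieval step, the sketch below builds a request to the public NCBI E-utilities `efetch` endpoint and parses the FASTA text it returns. It is a minimal sketch, not the NeoPulse implementation: the function names, the choice of the `nuccore` database, and the example accession are assumptions made for illustration.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

# Public NCBI E-utilities endpoint for fetching records.
EFETCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"


def build_efetch_url(accession, db="nuccore"):
    """Build an efetch URL that returns one record as plain FASTA text.

    The database name is an assumption; the case study does not say
    which Entrez database was queried.
    """
    params = {"db": db, "id": accession, "rettype": "fasta", "retmode": "text"}
    return EFETCH_URL + "?" + urlencode(params)


def parse_fasta(text):
    """Split FASTA text into (header, sequence) pairs."""
    records = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if any.
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line.upper())
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def fetch_fasta(accession):
    """Download and parse one record (performs a network call)."""
    with urlopen(build_efetch_url(accession)) as response:
        return parse_fasta(response.read().decode("utf-8"))
```

A call such as `fetch_fasta("NM_000546")` (an illustrative accession, not one named in the case study) would return a list of header and sequence pairs ready for the next step.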

To obtain BLAST results, Nitto BioPharma used the NCBI BLAST API, which provides access to the suite of programs that generate alignments between a nucleotide or protein sequence, referred to as the “query,” and nucleotide or protein sequences within a database, referred to as “subject” sequences.
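The NCBI BLAST URL API works in two stages: a `CMD=Put` request submits the query and returns a request ID (RID), and a later `CMD=Get` request retrieves the results for that RID. The sketch below shows how such requests could be assembled and how the RID could be read from the submission response. The helper names and the default `blastn`/`nt` choices are assumptions; the case study does not state which program or database was used.

```python
import re
from urllib.parse import urlencode

# Public endpoint of the NCBI BLAST URL API.
BLAST_URL = "https://blast.ncbi.nlm.nih.gov/Blast.cgi"


def build_submit_params(sequence, program="blastn", database="nt"):
    """Encode the form parameters for a CMD=Put submission.

    program and database defaults are illustrative assumptions.
    """
    return urlencode(
        {"CMD": "Put", "PROGRAM": program, "DATABASE": database, "QUERY": sequence}
    )


def parse_submit_response(body):
    """Extract the request ID and estimated wait (seconds) from a Put response.

    NCBI embeds these in a QBlastInfo comment block as
    'RID = ...' and 'RTOE = ...' lines.
    """
    rid_match = re.search(r"RID = (\S+)", body)
    if rid_match is None:
        raise ValueError("no RID found in BLAST response")
    rtoe_match = re.search(r"RTOE = (\d+)", body)
    wait_seconds = int(rtoe_match.group(1)) if rtoe_match else None
    return rid_match.group(1), wait_seconds


def build_result_url(rid, format_type="Text"):
    """Build the CMD=Get URL used to poll for and download results."""
    return BLAST_URL + "?" + urlencode(
        {"CMD": "Get", "RID": rid, "FORMAT_TYPE": format_type}
    )
```

In practice a client would POST the encoded parameters to `BLAST_URL`, wait roughly the estimated time, and then poll the result URL until the search has finished.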

NeoPulse then took the full FASTA sequence and broke it into 19-nucleotide siRNA sequences. The result table includes sequence and value columns and can be exported to a .csv file. NeoPulse then sent a BLAST API request with the sequences whose inhibition values are above the “Minimum Predicted Inhibition Value” input. In the BLAST result table, the user can see the inhibition value of each sequence that has a partial match to the original sequence.
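The windowing, ranking, and filtering steps can be sketched as follows. This is a minimal sketch under stated assumptions: the function names are hypothetical, and `gc_fraction` is only a stand-in scorer so the example runs; it is not the Transformer-based model described in this case study.

```python
import csv
import io

SIRNA_LENGTH = 19  # window length stated in the case study


def candidate_sirnas(sequence, length=SIRNA_LENGTH):
    """Return every contiguous window of `length` nucleotides."""
    seq = sequence.upper()
    return [seq[i:i + length] for i in range(len(seq) - length + 1)]


def gc_fraction(sequence):
    """Placeholder scorer (NOT the real model): fraction of G and C bases."""
    return (sequence.count("G") + sequence.count("C")) / len(sequence)


def rank_and_filter(candidates, predict, min_inhibition):
    """Score candidates, sort by score descending, keep those above the minimum."""
    scored = [(seq, predict(seq)) for seq in candidates]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return [(seq, value) for seq, value in scored if value > min_inhibition]


def to_csv(rows):
    """Render (sequence, value) rows as CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["sequence", "value"])
    writer.writerows(rows)
    return buffer.getvalue()
```

A target of N nucleotides yields N - 18 candidates, so the number of sequences sent on to BLAST is controlled mainly by the minimum predicted inhibition value.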

Results

Our solution achieved R=0.85 and AUC=0.93. It outperformed other published siRNA prediction models such as BIOPREDsi (Novartis model; Huesken et al., 2005), MysiRNA (Mysara et al., 2012), and SMEpred (Dar et al., 2016), which reported R=0.66, 0.70 and 0.72, respectively. The higher accuracy helps design siRNA sequences more efficiently, reducing the time and cost of screening in the lab.
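For readers unfamiliar with the two metrics, the sketch below shows how a Pearson correlation coefficient (R) and an area under the ROC curve (AUC) are typically computed from predicted and measured inhibition values. It is an illustration only: the function names are hypothetical, and the case study does not state how measured inhibition was converted into the binary labels that AUC requires, so the labels here are simply passed in.

```python
import math


def pearson_r(predicted, measured):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(predicted)
    if n != len(measured) or n < 2:
        raise ValueError("need two lists of equal length with at least 2 items")
    mean_p = sum(predicted) / n
    mean_m = sum(measured) / n
    cov = sum((p - mean_p) * (m - mean_m) for p, m in zip(predicted, measured))
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    var_m = sum((m - mean_m) ** 2 for m in measured)
    if var_p == 0 or var_m == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_p * var_m)


def roc_auc(scores, labels):
    """AUC as the share of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a pair.
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in positives:
        for q in negatives:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(positives) * len(negatives))
```

R measures how closely predicted values track measured inhibition on a continuous scale, while AUC measures how well the predictions separate effective from ineffective siRNAs once a cutoff for "effective" has been chosen.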

White-labeled solution (BLAST integration) built on top of NeoPulse.

The customer can integrate new data into the model through automated retraining.


References

Huesken et al. (2005) Design of a genome-wide siRNA library using an artificial neural network. Nat. Biotechnol. 23(8):995-1001.

Mysara et al. (2012) MysiRNA: improving siRNA efficacy prediction using a machine-learning model combining multi-tools and whole stacking energy (ΔG). J. Biomed. Inform. 45(3):528-534.

Dar et al. (2016) SMEpred workbench: A web server for predicting efficacy of chemically modified siRNAs. RNA Biol. 13(11):1144-1151.
