Bioinformatics

Aims

Genome-wide CRISPR screening is a rapidly evolving field, and several competing bioinformatics analysis programs are now available. There is a distinct lack of experimental validation of the hits prioritised by these programs, which makes researchers' choice of analysis tool arbitrary. Therefore, the aims of this project are:

  1. To develop an optimised bioinformatic pipeline based on publicly available algorithms for the identification of top gene hits from pooled genome-wide CRISPR screening data.
  2. To develop an adaptable arrayed high-content imaging-based CRISPR knockout assay for the verification of top gene hits from genome-wide CRISPR knockout screens.

Brief project outline

In collaboration with the GIH (Dr Jon Xu, main contributor), we recently developed a method for pooled genome-wide CRISPR screens to identify genes involved in diverse cellular phenotypes. Unexpectedly, analysing the raw sgRNA counts with several publicly available algorithms produced distinct hit lists with very little overlap in top-priority genes. These findings cast doubt on the single-algorithm approaches currently used to analyse these datasets, and indicate that information may be lost or masked depending on the researchers' choice of analysis program. We will therefore, for the first time, compare the hit-calling rates of three publicly available programs for the analysis of genome-wide CRISPR screens by feeding their hit lists into an arrayed, high-content imaging-based CRISPR knockout assay for direct validation of gene hits. This work will guide how CRISPR screening datasets are mined and analysed to detect biologically relevant findings in future studies.
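As an illustration of how the overlap between hit lists could be quantified, the sketch below computes the pairwise Jaccard index between the top-ranked genes from each algorithm. The algorithm labels, gene names, and the `top_hits`, `jaccard` and `pairwise_overlap` helpers are hypothetical placeholders rather than part of the project's pipeline, and the Jaccard index is only one of several reasonable overlap measures.

```python
from itertools import combinations


def top_hits(ranked_genes, n):
    """Return the first n genes of a ranked list as a set."""
    return set(ranked_genes[:n])


def jaccard(a, b):
    """Jaccard index |A & B| / |A | B|; 0.0 when both sets are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def pairwise_overlap(rankings, n):
    """Jaccard index of the top-n hits for every pair of algorithms."""
    tops = {name: top_hits(genes, n) for name, genes in rankings.items()}
    return {
        (x, y): jaccard(tops[x], tops[y])
        for x, y in combinations(sorted(tops), 2)
    }


# Hypothetical ranked gene lists (best hit first); not real screening data.
rankings = {
    "algorithm_A": ["GENE1", "GENE2", "GENE3", "GENE4"],
    "algorithm_B": ["GENE3", "GENE5", "GENE1", "GENE6"],
    "algorithm_C": ["GENE7", "GENE8", "GENE2", "GENE9"],
}

overlap = pairwise_overlap(rankings, n=3)
for pair, score in overlap.items():
    print(pair, round(score, 2))
```

A value near 0 for most pairs would correspond to the "very little overlap" described above, while a value near 1 would indicate that the algorithms largely agree on their top hits.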

Genomics-based innovative aspect of proposal 

Innovation. The genomics-based innovative aspects of this project are three-fold. We will:

  1. Employ the TransEdit library, which is designed such that each gene is targeted by a maximum of 6 sgRNAs that are paired over 3 constructs for effective gene knockdown in an arrayed format.

  2. Develop and refine a bioinformatics pipeline for the analysis of genome-wide CRISPR screens, either by identifying the algorithm with the highest hit-calling rate or by adopting the use of multiple algorithms as standard practice.

  3. Apply innovative hierarchical clustering analysis to determine the relationship between genotype and cellular phenotype, in order to validate and rank genes of interest and to establish the true hit-calling rate of each algorithm.
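The comparison of algorithms in point 2 can be sketched as follows. This is a minimal example that assumes the hit-calling rate is defined as the fraction of an algorithm's called hits that are confirmed in the arrayed validation assay; the proposal does not state a formal definition, so this definition, the `hit_calling_rate` helper, and all gene and algorithm names are assumptions made for illustration.

```python
def hit_calling_rate(called, validated):
    """Fraction of called hits that were confirmed by validation.

    Assumed definition: confirmed called hits / all called hits.
    Returns 0.0 if the algorithm called no hits.
    """
    called = set(called)
    if not called:
        return 0.0
    return len(called & set(validated)) / len(called)


# Hypothetical hit lists per algorithm; not real screening data.
called_hits = {
    "algorithm_A": ["GENE1", "GENE2", "GENE3", "GENE4"],
    "algorithm_B": ["GENE3", "GENE5", "GENE1", "GENE6"],
    "algorithm_C": ["GENE7", "GENE8"],
}

# Hypothetical set of genes confirmed in the arrayed knockout assay.
validated = {"GENE1", "GENE3", "GENE5", "GENE7"}

rates = {
    name: hit_calling_rate(genes, validated)
    for name, genes in called_hits.items()
}
best_algorithm = max(rates, key=rates.get)

for name, rate in sorted(rates.items()):
    print(f"{name}: {rate:.2f}")
print("highest hit-calling rate:", best_algorithm)
```

If no single algorithm clearly outperforms the others under this measure, the same table of rates would support the alternative of running multiple algorithms as standard practice.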

Existing GIH capabilities. To execute this study, we will build upon the bioinformatic skills that were developed by Jon Xu in our 2019-2020 GIH genome-wide CRISPR screening project. The proposed project will provide experimental evidence of the true hit-calling rate of each algorithm. With this knowledge, we will be able to build and refine a framework for best-practice analysis of future genome-wide CRISPR screens.

New capabilities required. High-content imaging using the PerkinElmer Operetta and expertise in hierarchical clustering of top hits will be arranged with our collaborators, Shyuan Ngo and Kaylene Simpson, respectively.

Broad applicability of the technique

The refined bioinformatic strategy for identifying top-priority genes from genome-wide CRISPR screens could be adopted immediately by other labs at UQ and, once published, by labs internationally. Once established, the platform for arrayed CRISPR knockout screening and high-content imaging could be adopted by other labs within a short timeframe, covering prioritisation of hits, library design, randomisation of genes on plate maps, testing, imaging, and analysis. The arrayed CRISPR with high-content imaging technique is applicable to any biological question where manipulation of genes results in a quantifiable cellular phenotype (morphology, cell death or division, fluorescence, reporter gene expression, subcellular localisation, punctate structures, etc.). There is also the possibility of extending this technique to primary cultures and patient-derived induced pluripotent stem cells. We are willing to help and work with other groups as they adapt the pipeline that we have developed to their own studies. This technology is highly versatile and will be applicable to many biological questions of interest to UQ researchers.
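The plate-map randomisation step mentioned above could look something like the sketch below. It assumes a 96-well plate (rows A to H, columns 1 to 12) and a fixed random seed so that a layout can be regenerated; the plate format, the `plate_wells` and `randomise_plate_map` helpers, and the gene names are all hypothetical, and a real design would likely also reserve wells for controls.

```python
import random
from string import ascii_uppercase


def plate_wells(rows=8, cols=12):
    """Well labels in row-major order, e.g. A1..A12, B1..B12, ..."""
    return [
        f"{ascii_uppercase[r]}{c}"
        for r in range(rows)
        for c in range(1, cols + 1)
    ]


def randomise_plate_map(genes, seed, rows=8, cols=12):
    """Assign genes to wells in a random but reproducible order."""
    wells = plate_wells(rows, cols)
    if len(genes) > len(wells):
        raise ValueError("more genes than wells on the plate")
    rng = random.Random(seed)  # seeded so the layout can be reproduced
    shuffled = list(genes)     # copy, so the caller's list is not modified
    rng.shuffle(shuffled)
    return dict(zip(wells, shuffled))


# Hypothetical prioritised hits; not real screening data.
genes = [f"GENE{i}" for i in range(1, 21)]
plate_map = randomise_plate_map(genes, seed=2020)

for well, gene in list(plate_map.items())[:5]:
    print(well, gene)
```

Randomising positions in this way is intended to reduce the chance that plate-position effects are confounded with hit priority, since genes are no longer laid out in ranked order.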

 

Project members


Research collaborators

Dr Rebecca San Gil
FightMND Research Fellow
Queensland Brain Institute

Dr Adam Walker
Ross Maclean Fellow
Queensland Brain Institute

Genome Innovation Hub

Jun Xu
Computational biologist
Former GIH staff

Dr Sohye Yoon
Research Specialist - Genomics
Genome Innovation Hub

Dr Jun Ma
Research Specialist - Biochemistry
Genome Innovation Hub