NIRD: Network Inference by Reduced Dimensions

Frequently Asked Questions (FAQs)

1. What are gene regulatory networks (GRNs)? Why are they important?

Answer: Gene Regulatory Networks (GRNs) represent how genes interact with and regulate each other, often through transcription factors. They help researchers understand cellular functions, disease mechanisms, and biological responses, which makes them central to systems biology and precision medicine.

2. What is matrix factorization (MF) and how can it help in gene regulatory inference?

Answer: Matrix Factorization (MF) is a technique that breaks down a high-dimensional gene expression matrix into low-rank components. It helps uncover hidden patterns and relationships between genes, enabling more accurate and scalable inference of gene regulatory networks.
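
As an illustration of the idea, here is a minimal sketch of one common factorization, non-negative matrix factorization (NMF) with multiplicative updates, applied to a toy expression matrix. It is not NIRD's implementation; the function name `nmf`, the toy data, and the choice of rank 3 are assumptions made for this example.

```python
import numpy as np

def nmf(X, rank, n_iter=200, seed=0, eps=1e-9):
    """Factor a non-negative matrix X (cells x genes) as W @ H.

    Uses multiplicative updates for the squared reconstruction error.
    W (cells x rank) holds per-cell activities of the latent factors,
    H (rank x genes) holds the gene loadings of each factor.
    """
    rng = np.random.default_rng(seed)
    n_cells, n_genes = X.shape
    W = rng.random((n_cells, rank)) + eps
    H = rng.random((rank, n_genes)) + eps
    for _ in range(n_iter):
        # Each update keeps the factors non-negative.
        H *= (W.T @ X) / (W.T @ W @ H + eps)
        W *= (X @ H.T) / (W @ H @ H.T + eps)
    return W, H

# Toy data (illustrative only): 60 cells x 20 genes driven by 3 hidden programs.
rng = np.random.default_rng(42)
true_W = rng.random((60, 3))
true_H = rng.random((3, 20))
X = true_W @ true_H + 0.01 * rng.random((60, 20))

W, H = nmf(X, rank=3)
rel_error = np.linalg.norm(X - W @ H) / np.linalg.norm(X)
print(W.shape, H.shape, round(rel_error, 3))
```

The rows of `H` can be read as gene programs and the columns of `W` as their activity per cell, which is the kind of low-rank structure the answer above refers to.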

3. How is matrix factorization better than other dimension reduction techniques like PCA and t-SNE?

Answer: Unlike PCA and t-SNE, which focus on variance or visualization, MF emphasizes reconstructing meaningful latent factors. It offers better interpretability for regulatory inference, especially in noisy and sparse biological data like single-cell RNA-seq.

4. How does NIRD work?

Answer: NIRD first uses matrix factorization to reduce high-dimensional gene expression data to a low-dimensional space. It then applies a Conditional Random Forest model to rank the latent features by importance. Projecting these rankings back to the original gene space yields an estimate of each gene's contribution. These estimates are used to reconstruct a gene regulatory network that reflects the most likely biological dependencies among genes.
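
The back-projection step can be sketched as follows. This is only one plausible way to do it, not NIRD's documented formula: it assumes a gene's score is the importance-weighted sum of its absolute factor loadings. The function name `project_importance` and all numbers are hypothetical.

```python
import numpy as np

def project_importance(latent_importance, loadings):
    """Map importances of k latent components onto the original genes.

    latent_importance: shape (k,), one score per latent component
        (for example, taken from a fitted forest model).
    loadings: shape (k, n_genes), factorization weights linking
        components to genes.
    Assumption: a gene's score is the importance-weighted sum of its
    absolute loadings, normalised so that all scores sum to 1.
    """
    latent_importance = np.asarray(latent_importance, dtype=float)
    loadings = np.abs(np.asarray(loadings, dtype=float))
    scores = latent_importance @ loadings
    total = scores.sum()
    return scores / total if total > 0 else scores

# Illustrative values only: 2 latent components, 4 regulator genes.
importance = [0.8, 0.2]
H = [[0.9, 0.1, 0.0, 0.0],
     [0.0, 0.0, 0.5, 0.5]]
gene_scores = project_importance(importance, H)
ranking = np.argsort(-gene_scores)  # gene indices, highest score first
print(gene_scores, ranking)
```

A gene that loads heavily on an important component ends up near the top of the ranking, even though the model never saw that gene directly.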

5. Which methods are offered by NIRD?

Answer: NIRD provides a diverse set of matrix factorization techniques to capture various biological signals and network structures. The supported methods include:

These methods offer flexibility to tailor the dimensionality reduction process based on the nature of your dataset and the specific inference goals.

6. What types of datasets are supported?

Answer: NIRD supports a variety of datasets, including bulk and single-cell RNA-seq data, time-course data, transcription velocity data, and categorical data such as mutations.

7. Can I use my own factorization method?

Answer: Yes. NIRD is modular and supports custom methods.

8. How are GRNs constructed from reduced data?

Answer: After dimensionality reduction, NIRD models the influence of regulatory genes (like transcription factors) on other genes using machine learning methods, typically random forests. These models operate in the low-dimensional space to capture non-linear dependencies. Feature importance scores from the regressors are combined with matrix factorization weights to estimate gene-gene regulatory links. This results in a weighted gene regulatory network that reflects the likelihood of biological interactions, with improved signal clarity and flexibility for downstream analysis.
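
Once a score is available for every regulator-target pair, the weighted network is essentially a ranked edge list. A minimal sketch, assuming the scores are stored as one row per target gene and that self-regulation is excluded (both are assumptions for this example, and `build_edge_list` is a hypothetical helper, not part of NIRD):

```python
def build_edge_list(scores, regulators, targets):
    """Turn a target-by-regulator score table into a ranked edge list.

    scores[i][j] is the estimated influence of regulators[j] on targets[i].
    Returns (regulator, target, weight) tuples, strongest edge first.
    """
    edges = []
    for i, target in enumerate(targets):
        for j, regulator in enumerate(regulators):
            if regulator == target:
                continue  # assumption: self-loops are not reported
            edges.append((regulator, target, scores[i][j]))
    edges.sort(key=lambda edge: edge[2], reverse=True)
    return edges

# Illustrative scores only.
regulators = ["TF1", "TF2"]
targets = ["TF1", "G1", "G2"]
scores = [
    [0.0, 0.6],  # influences on TF1
    [0.9, 0.1],  # influences on G1
    [0.3, 0.7],  # influences on G2
]
network = build_edge_list(scores, regulators, targets)
for regulator, target, weight in network:
    print(f"{regulator} -> {target}: {weight}")
```

Keeping the weights rather than thresholding them leaves the cutoff to whichever downstream analysis uses the network.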

9. How are inferred networks evaluated?

Answer: NIRD evaluates inferred gene regulatory networks (GRNs) using an edge-overlap approach that compares ranked gene-gene interactions across networks. It counts how many top-ranked edges from one network are shared with another and builds an edge-overlap curve from these counts. The area under this curve (AUC) quantifies similarity: a higher AUC means greater agreement between networks. This method helps assess whether the inferred GRNs are consistent, stable, and biologically meaningful across datasets and conditions.
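
A minimal sketch of such an edge-overlap curve is shown below. The exact definition NIRD uses is not given here, so this example assumes the curve value at k is the fraction of the top-k edges shared by both networks, and that the AUC is the mean of the curve (so it lies between 0 and 1). The function name `edge_overlap_auc` and the toy edges are hypothetical.

```python
def edge_overlap_auc(ranked_a, ranked_b, max_k=None):
    """Compare two ranked edge lists (strongest edge first).

    Each edge is a hashable item such as a (regulator, target) tuple,
    and each list is assumed to contain no duplicate edges.
    Returns the overlap curve and its normalised area.
    """
    if max_k is None:
        max_k = min(len(ranked_a), len(ranked_b))
    seen_a, seen_b = set(), set()
    curve = []
    for k in range(1, max_k + 1):
        seen_a.add(ranked_a[k - 1])
        seen_b.add(ranked_b[k - 1])
        # Fraction of the top-k edges that appear in both networks.
        curve.append(len(seen_a & seen_b) / k)
    auc = sum(curve) / len(curve) if curve else 0.0
    return curve, auc

# Toy networks (illustrative only).
net_a = [("TF1", "G1"), ("TF1", "G2"), ("TF2", "G3")]
net_b = [("TF1", "G2"), ("TF1", "G1"), ("TF3", "G4")]
curve, auc = edge_overlap_auc(net_a, net_b)
print(curve, auc)
```

Two identical rankings give an AUC of 1 under this definition, and two networks with no shared edges give 0.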

10. How do I select the most suitable method for a particular dataset?

Answer: To choose the most suitable matrix factorization method for your dataset in NIRD, you can follow two key approaches:

By combining these evaluations, you can systematically select the best-performing method tailored to your specific data type and experimental condition.