Bioinformatics questions and answers

Bioinformatics

Mini-bulk RNA-seq and scRNA-seq analysis differences?

We have RNA-seq data from libraries prepared using the Smart-seq2 single-cell protocol on 500 cells per library (mini-bulk). The complexity is much better than with single cells (~14k genes for 1.5M reads). 6 cell types per biological replicate were collected by flow cytometry, and there are 4 control and 5 diseased ....

Austin: yesterday

Why do ten rows (Figure_1) correspond to 2 bits (Figure_2) in a sequence logo?

Following this question (https://bioinformatics.stackexchange.com/questions/9087/what-is-aligned-sequences-and-consensus-sequence-in-the-context-of-sequence), I'm confused about how a sequence logo (https://en.wikipedia.org/wiki/Sequence_logo) is computed. The following data comes from the book "Machine Learning - A Probabilistic Perspective" (http://people.cs.ubc.ca/~murphyk/MLbook/) (Figure_1), and here is the corresponding sequence logo (https://bioinformatics.stackexchange.com/a/9089/4962) (Figure_2). Ten rows represent sequences of DNA (e.g. row 1 could be a human sequence, row ....

Henry: yesterday
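
For the sequence-logo question above, a minimal sketch of where the 2-bit ceiling comes from may help. It assumes the usual definition of column height as log2(4) minus the Shannon entropy of the column, and it leaves out the small-sample correction that some logo tools apply. The three columns below are made up for illustration and are not taken from Figure_1.

```python
import math
from collections import Counter


def column_information(column, alphabet="ACGT"):
    """Information content in bits of one alignment column.

    Computed as log2(alphabet size) minus the Shannon entropy of the
    observed base frequencies, with no small-sample correction.
    """
    counts = Counter(column)
    n = len(column)
    entropy = 0.0
    for base in alphabet:
        p = counts.get(base, 0) / n
        if p > 0:
            entropy -= p * math.log2(p)
    return math.log2(len(alphabet)) - entropy


# Hypothetical columns, read top to bottom across the aligned rows.
conserved = "AAAAAAAAAA"   # ten rows, the same base in every row
two_bases = "AAAAACCCCC"   # ten rows, two bases at 50% each
uniform = "ACGTACGTACGT"   # all four bases equally frequent

for name, col in [("conserved", conserved), ("two_bases", two_bases), ("uniform", uniform)]:
    print(name, round(column_information(col), 3))
```

Under these assumptions the number of rows does not set the maximum; the alphabet size does. With four possible bases, a perfectly conserved column reaches log2(4) = 2 bits however many sequences are stacked.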

Assembly reads having a copy of their beginning in their tail

I am analyzing the reads for the SARS-CoV-2 assembly with id SRR11140748. Apparently these reads were obtained with parallel sequencing by Illumina and Oxford Nanopore Technologies. I have found these reads GGTAAGTATGTACAAATACCTACAACTTGTGCTAATGACCCTGTGGGTTTTACACTTAAAAACACAGTCTGTACCGTCTGCGGTATGTGGAAAGGTTATGGCTGTAGTTGTGATCAACTCCGCGAACCCATGCTTCAGTCAGCTGATGCACAATCGTTTTTAAACGGGTTTGCGGTGTAAGTGCAGCCCGTCTTACACCGTGCGGCACAGGCACTAGTACTGATGTCGTATACAGGGCTTTTGACATCAGTATGTACAAATACCTACAACTTGTGCTAATGACCCTGTGGGTTTTACACTTAAAAACACAGTCTGTACCGTCTGCGGTATGTGGAAAGGTTATGGCTGTAGTTGTGATCAACTCCGCGAACCCATGCTTCAGTCAGCTGATGCACAATCGTTTTTAAACGAGTATGTACAAATACCTACAACTTGTGCT As you can see, the second one has an exact copy at the tail of its beginning: AGTATGTACAAATACCTACAACTTGTGCT ....

Aria: 2 days ago
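
One way to screen reads for the pattern described in the question above (the start of a read repeated exactly at its end) is to look for the longest prefix that is also a suffix. The sketch below is only an illustration: the function name, the `min_len` cutoff, and the toy reads are all hypothetical, and the toy reads are not taken from SRR11140748.

```python
def longest_prefix_at_tail(read, min_len=10):
    """Return the longest proper prefix of `read` that is also its suffix.

    Only matches of at least `min_len` bases are reported; an empty
    string is returned when there is none. Prefix and suffix may overlap.
    """
    for k in range(len(read) - 1, min_len - 1, -1):
        if read[:k] == read[-k:]:
            return read[:k]
    return ""


# Toy reads (hypothetical): the first starts and ends with "ACGTAC".
toy_read = "ACGTAC" + "GGGTTT" + "ACGTAC"
plain_read = "ACGTTTGGCA"

print(longest_prefix_at_tail(toy_read, min_len=4))
print(longest_prefix_at_tail(plain_read, min_len=4))
```

The loop is quadratic in read length in the worst case, which is usually fine for short reads. For long reads, a linear-time prefix function (as used in Knuth-Morris-Pratt matching) would be the natural replacement.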

Using the BLAT command-line tool to BLAT split sequences

I have the nucleotide sequence: AATTGAGGCACATTTTTTTTTAGACAGTCTTGCTCTGTTGCCCAGGCTGGAGTGCAGTGGTGTGATCATAGCTCACTGCAGCCTCGACCTCCTGGGCTCAACAAAGCACACAGTGGGCGGATCCCCACCAG. When I BLAT this on the UCSC Genome Browser, the first hit is a match that spans 6572 base pairs on chromosome 19; it essentially matches the first portion of the sequence to one spot on the genome and the second portion to another part of ....

Harper: 2 days ago

Prediction of prokaryotic origins of replication (ORI)

I want to predict origins of replication (ORI) in hundreds of prokaryotic genomes. The most straightforward solution would be to use the most commonly used tool, Ori-Finder (https://bmcbioinformatics.biomedcentral.com/articles/10.1186/1471-2105-9-79). It uses integrated gene prediction, analysis of base composition asymmetry, distribution of DnaA boxes, occurrence of genes frequently close to oriC regions and ....

Ariana: 3 days ago

Can I change the name of the file features.tsv to genes.tsv?

It is said that the features.tsv from Cell Ranger v3 is analogous to the genes.tsv from Cell Ranger v2. So can I change the file name to genes.tsv for Seurat to read it? I know that Seurat has an updated version 3 that can read the features.tsv, and I would ....

Amelia: 4 days ago
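
For the features.tsv question above, a plain rename may not be enough. As far as I know, Cell Ranger v3 writes a gzipped features.tsv.gz with a third column for the feature type, whereas the v2 genes.tsv is uncompressed with two columns (gene ID, gene name). The sketch below rests on that assumption: it keeps the first two columns and writes them uncompressed. The function name and the two example rows are illustrative only, and the barcodes and matrix files may likewise need decompressing for an older reader.

```python
import gzip
import tempfile
from pathlib import Path


def features_to_genes(features_path, genes_path):
    """Write a v2-style genes.tsv from a v3-style features file.

    Assumes a tab-separated input whose first two columns are gene ID
    and gene name; any further columns (e.g. feature type) are dropped.
    Accepts both gzipped (.gz) and plain-text input.
    """
    features_path = Path(features_path)
    opener = gzip.open if features_path.suffix == ".gz" else open
    with opener(features_path, "rt") as src, open(genes_path, "w") as dst:
        for line in src:
            fields = line.rstrip("\r\n").split("\t")
            dst.write("\t".join(fields[:2]) + "\n")


# Small self-contained demonstration on a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    features = Path(tmp) / "features.tsv.gz"
    genes = Path(tmp) / "genes.tsv"
    with gzip.open(features, "wt") as fh:
        # Illustrative rows in the assumed three-column layout.
        fh.write("ENSG00000243485\tMIR1302-2HG\tGene Expression\n")
        fh.write("ENSG00000237613\tFAM138A\tGene Expression\n")
    features_to_genes(features, genes)
    converted = genes.read_text()

print(converted)
```

If the matrix contains feature types other than gene expression (for example antibody capture), dropping the third column loses that distinction, so it is worth checking the column contents before converting.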

Does BLASTClust guarantee that proteins in different clusters are dissimilar?

I need to find dissimilar proteins. Looking through the PDB I found the weekly BLASTClust results of proteins that are 30% similar. However, I do not know if protein A in cluster 1 is guaranteed to be less than 30% similar to protein B in cluster 2. The best and ....

Carter: 5 days ago

How to resolve the Snakemake error: "Target rules may not contain wildcards."

I would like to do easily reproducible analysis using publicly available data from NCBI, so I have chosen Snakemake. I would like to write a single rule that would be able to download any genome given a species code name and separated table of species and their NCBI IDs. ....

Piper: 5 days ago

Real time transcript profiles

Do any methods exist (or are any in development) for investigating transcript data without lysing the cells, i.e., without destroying the sample? ....

Oliver: 5 days ago

How can I tell DESeq2 to perform only the comparisons I am interested in?

I am currently performing a large RNA-seq analysis of mouse PBMCs. The dataset contains around 6,000 transcriptomic profiles and I would like to use DESeq2 to identify the sets of differentially expressed genes in the different conditions. In total, I have 100 biological stimulations, and for each stimulation I have ....

Maria: 6 days ago