Statistics is fun because there are many paths. Most microbiome studies take the easy, but naïve, path of computing averages and standard deviations. As my dataset has grown, I have been travelling some less-travelled paths, for example Visual Exploration of Odds Ratios and a patent-pending method termed “Kaltoft-Moltrup”.
One of the frequent decisions that I see in studies is to limit examination to bacteria that occur with high frequency in the samples. This allows the researchers to keep to familiar, classic statistics. Comparing how often a bacterium is observed in the control group with how often it is observed in the condition group is one of these much less travelled paths. It usually requires big sample sizes, and many studies have a sample size of 30 (sufficient for the mean and standard deviation approach).
I just completed code to compute Chi2 using Biomesight data for users reporting Autism.
- Control Population: 3525
- Autism: 88
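The post does not show the code itself, so the following is only a minimal sketch of how such a statistic could be computed from presence counts. It assumes the expected count is the control group's frequency scaled to the size of the condition group, and that Chi2 is the single term (O - E)^2 / E. The rounded values in the tables appear roughly consistent with that form, but it is an assumption, and the function names are hypothetical.

```python
def expected_count(control_present: int, control_total: int,
                   condition_total: int) -> float:
    """Number of condition samples expected to contain a taxon if the
    condition group had the same frequency as the control group."""
    if control_total <= 0:
        raise ValueError("control_total must be positive")
    return control_present / control_total * condition_total


def chi2_term(observed: float, expected: float) -> float:
    """Single-cell chi-square term (O - E)^2 / E.
    Assumed form; the post does not confirm the exact formula."""
    if expected <= 0:
        raise ValueError("expected must be positive")
    return (observed - expected) ** 2 / expected


# Observed and Expected taken from the Butyricimonas synergistica row
# of the under-represented table in this post.
butyricimonas_chi2 = chi2_term(17, 36)
print(round(butyricimonas_chi2, 1))
```

With real data, `expected_count` would be called with the number of control samples containing the taxon, the control population (3525) and the autism population (88).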
Chi2 can be converted to the probability (p) of the result happening at random using a standard table of critical values.
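Where no table is at hand, the same conversion can be computed directly. The sketch below assumes one degree of freedom, which the post does not state explicitly but which matches the 6.635 cut-off for p < 0.01 used later. For one degree of freedom the upper-tail probability reduces to the complementary error function, so only the standard library is needed. The function name `chi2_to_p` is hypothetical.

```python
import math


def chi2_to_p(chi2: float) -> float:
    """Upper-tail p-value for a chi-square statistic.
    Assumes 1 degree of freedom, where p = erfc(sqrt(chi2 / 2))."""
    if chi2 < 0:
        raise ValueError("chi2 must be non-negative")
    return math.erfc(math.sqrt(chi2 / 2.0))


# Standard critical values for 1 degree of freedom
# (p = 0.05, 0.01 and 0.001 respectively).
for value in (3.841, 6.635, 10.828):
    print(f"Chi2 = {value:6.3f} -> p = {chi2_to_p(value):.4f}")
```

For other degrees of freedom a statistics library would be the safer choice; this shortcut only holds for the one-degree case.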
Seen too Rarely (Want to increase)
We see one bacterium that is available as a probiotic: Bifidobacterium adolescentis. The rest would need to be altered by diet.
Tax Name | Tax Rank | Chi2 | Observed | Expected | Shift |
Butyricimonas synergistica | species | 10 | 17 | 36 | Under-Represented |
Bifidobacterium adolescentis JCM 15918 | strain | 9.8 | 9 | 24 | Under-Represented |
Dehalobacterium | genus | 9.7 | 18 | 37 | Under-Represented |
Pelotomaculum isophthalicicum | species | 8 | 17 | 33 | Under-Represented |
Ammonifex thiophilus | species | 7.3 | 17 | 32 | Under-Represented |
Seen too Often (Want to decrease)
We see 32 taxa with a Chi2 over 6.635 (p < 0.01, or 1 chance in 100 of being a false detection). One very striking feature is that many different species of Bifidobacterium are over-represented while one species is under-represented. This is not a simple situation to address.
Tax Name | Tax Rank | Chi2 | Observed | Expected | Shift |
Bifidobacterium catenulatum subsp. kashiwanohense | subspecies | 43.5 | 54 | 23 | Over-Represented |
Bifidobacterium angulatum | species | 33 | 39 | 16 | Over-Represented |
Staphylococcus pseudolugdunensis | species | 23.6 | 20 | 7 | Over-Represented |
Clostridium cellulovorans | species | 22.9 | 20 | 7 | Over-Represented |
Bifidobacterium catenulatum PV20-2 | strain | 19.5 | 58 | 33 | Over-Represented |
Streptococcus mutans | species | 19 | 23 | 10 | Over-Represented |
Hungateiclostridium | genus | 18.3 | 30 | 14 | Over-Represented |
Hungateiclostridiaceae | family | 18.1 | 30 | 14 | Over-Represented |
Streptococcus intermedius | species | 17.5 | 23 | 10 | Over-Represented |
Bifidobacterium catenulatum | species | 16.9 | 58 | 34 | Over-Represented |
Absiella | genus | 16.7 | 22 | 10 | Over-Represented |
Clostridium chartatabidum | species | 16.7 | 43 | 23 | Over-Represented |
Bifidobacterium gallicum | species | 15.4 | 73 | 47 | Over-Represented |
Prevotella veroralis | species | 13 | 15 | 6 | Over-Represented |
Corynebacterium durum | species | 12.6 | 14 | 6 | Over-Represented |
Bifidobacterium thermacidophilum | species | 11.3 | 18 | 8 | Over-Represented |
Parascardovia | genus | 10.5 | 32 | 18 | Over-Represented |
Klebsiella oxytoca | species | 10.5 | 27 | 15 | Over-Represented |
Bifidobacterium scardovii | species | 10 | 30 | 17 | Over-Represented |
Bifidobacterium cuniculi | species | 9.9 | 31 | 18 | Over-Represented |
Candidatus Blochmanniella camponoti | species | 9.8 | 21 | 11 | Over-Represented |
Abiotrophia | genus | 9.2 | 12 | 5 | Over-Represented |
Enterococcus gilvus | species | 9.1 | 14 | 6 | Over-Represented |
Megamonas funiformis | species | 8.8 | 21 | 11 | Over-Represented |
Segatella oulorum | species | 8.6 | 22 | 12 | Over-Represented |
Ralstonia | genus | 7.9 | 14 | 7 | Over-Represented |
Bifidobacterium indicum | species | 7.7 | 67 | 48 | Over-Represented |
Candidatus Blochmanniella | genus | 7.2 | 35 | 22 | Over-Represented |
ant endosymbionts | clade | 7.2 | 35 | 22 | Over-Represented |
unclassified Bacteroidetes Order II. | order | 7.2 | 75 | 55 | Over-Represented |
Enterobacter hormaechei | species | 6.9 | 36 | 23 | Over-Represented |
Moorella group | norank | 6.7 | 66 | 48 | Over-Represented |
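The selection rule behind both tables (keep taxa whose Chi2 exceeds 6.635, then label the direction of the shift by comparing Observed with Expected) can be sketched as follows. This is an illustration, not the author's code: the function name `classify` is hypothetical, and the last sample row is invented purely to show a taxon being filtered out.

```python
THRESHOLD = 6.635  # Chi2 cut-off used in this post (p < 0.01)

# (taxon, chi2, observed, expected)
rows = [
    ("Butyricimonas synergistica", 10.0, 17, 36),  # from the tables above
    ("Bifidobacterium angulatum", 33.0, 39, 16),   # from the tables above
    ("Moorella group", 6.7, 66, 48),               # from the tables above
    ("hypothetical taxon", 2.1, 20, 15),           # invented, below cut-off
]


def classify(rows, threshold=THRESHOLD):
    """Keep rows whose Chi2 exceeds the threshold and label each one
    as over- or under-represented relative to its expected count."""
    result = []
    for name, chi2, observed, expected in rows:
        if chi2 <= threshold:
            continue  # not significant at the chosen level
        shift = ("Over-Represented" if observed > expected
                 else "Under-Represented")
        result.append((name, shift))
    return result


for name, shift in classify(rows):
    print(f"{name}: {shift}")
```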
Bottom Line
The next step is to compute similar tables for all symptoms and incorporate these findings into a new algorithm. I say new, because I do not know if it is better than the existing ones. Conceptually, it would be added as a 5th set of suggestions to the existing consensus view on Microbiome Prescription.