Statistically Significant Bacteria Shifts Seen in Autism

Statistics is fun because there are many paths. Most microbiome studies take the easy, but naïve, path of computing averages and standard deviations. As my dataset has grown, I have been travelling some less-travelled paths, for example: Visual Exploration of Odds Ratios, and a patent-pending method termed “Kaltoft-Moltrup”.

One of the frequent decisions that I see in studies is to limit examination to bacteria that have a high frequency in the samples. This allows the researchers to keep to familiar, classic statistics. Using the frequency of observation in the control group and the condition group is one of these much less travelled paths. It usually requires big sample sizes, and many studies have a sample size of only 30 (sufficient for the mean and standard deviation approach).

I just completed code to compute Chi2 using Biomesight data for users reporting Autism.

  • Control Population: 3525
  • Autism: 88
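As a rough illustration of the idea, here is a minimal sketch of a frequency-based Chi2 calculation. It assumes the expected count is the number of condition samples that would contain a taxon if it were detected at the same rate as in the control group, and that Chi2 is the single-cell value (O - E)^2 / E. This reading is consistent with the figures in the tables below, but it is not confirmed to be the code actually used; the function names are hypothetical.

```python
def expected_count(control_hits: int, control_n: int, condition_n: int) -> float:
    """Expected number of condition samples containing a taxon, assuming the
    condition group had the same detection rate as the control group."""
    return condition_n * control_hits / control_n


def chi2_single(observed: float, expected: float) -> float:
    """Single-cell chi-square value: (O - E)^2 / E (an assumed formula)."""
    if expected <= 0:
        raise ValueError("expected count must be positive")
    return (observed - expected) ** 2 / expected


def shift(observed: float, expected: float) -> str:
    """Label the direction of the difference between observed and expected."""
    return "Over-Represented" if observed > expected else "Under-Represented"


# Observed and expected counts from the Butyricimonas synergistica row:
# seen in 17 of the 88 autism samples, 36 expected.
chi2 = chi2_single(17, 36)
print(round(chi2, 1), shift(17, 36))
```

The published expected counts appear to be rounded to whole numbers, so values recomputed this way can differ slightly from the Chi2 column.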

Chi2 can be converted to the probability (p) of the result happening at random using a standard chi-square table; for example, a Chi2 above 6.635 corresponds to p < 0.01.
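The conversion can also be done in code instead of with a lookup table. The sketch below assumes one degree of freedom, which matches the 6.635 cut-off for p < 0.01 used later in this post; the function name is hypothetical. With one degree of freedom, the upper-tail probability reduces to the complementary error function, so only the standard library is needed.

```python
import math


def chi2_to_p(chi2: float) -> float:
    """Upper-tail p-value of a chi-square statistic, assuming 1 degree of freedom."""
    if chi2 < 0:
        raise ValueError("chi2 must be non-negative")
    # For 1 degree of freedom, P(X > x) = erfc(sqrt(x / 2)).
    return math.erfc(math.sqrt(chi2 / 2.0))


# Common critical values for 1 degree of freedom.
for threshold in (3.841, 6.635, 10.828):
    print(threshold, round(chi2_to_p(threshold), 3))
```

These three thresholds correspond to p of roughly 0.05, 0.01, and 0.001 respectively.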

Seen too Rarely (Want to increase)

We see one bacterium that is available as a probiotic: Bifidobacterium adolescentis. The rest would need to be altered by diet.

| tax_name | TAX_RANK | Chi2 | Observed | Expected | Shift |
|---|---|---|---|---|---|
| Butyricimonas synergistica | species | 10 | 17 | 36 | Under-Represented |
| Bifidobacterium adolescentis JCM 15918 | strain | 9.8 | 9 | 24 | Under-Represented |
| Dehalobacterium | genus | 9.7 | 18 | 37 | Under-Represented |
| Pelotomaculum isophthalicicum | species | 8 | 17 | 33 | Under-Represented |
| Ammonifex thiophilus | species | 7.3 | 17 | 32 | Under-Represented |

Seen too Often (Want to decrease)

We see 32 bacteria with a Chi2 over 6.635 (p < 0.01, or 1 chance in 100 of being a false detection). One very striking feature is that many different species of Bifidobacterium are over-represented while one species is under-represented. This is not a simple situation to address.

| tax_name | TAX_RANK | Chi2 | Observed | Expected | Shift |
|---|---|---|---|---|---|
| Bifidobacterium catenulatum subsp. kashiwanohense | subspecies | 43.5 | 54 | 23 | Over-Represented |
| Bifidobacterium angulatum | species | 33 | 39 | 16 | Over-Represented |
| Staphylococcus pseudolugdunensis | species | 23.6 | 20 | 7 | Over-Represented |
| Clostridium cellulovorans | species | 22.9 | 20 | 7 | Over-Represented |
| Bifidobacterium catenulatum PV20-2 | strain | 19.5 | 58 | 33 | Over-Represented |
| Streptococcus mutans | species | 19 | 23 | 10 | Over-Represented |
| Hungateiclostridium | genus | 18.3 | 30 | 14 | Over-Represented |
| Hungateiclostridiaceae | family | 18.1 | 30 | 14 | Over-Represented |
| Streptococcus intermedius | species | 17.5 | 23 | 10 | Over-Represented |
| Bifidobacterium catenulatum | species | 16.9 | 58 | 34 | Over-Represented |
| Absiella | genus | 16.7 | 22 | 10 | Over-Represented |
| Clostridium chartatabidum | species | 16.7 | 43 | 23 | Over-Represented |
| Bifidobacterium gallicum | species | 15.4 | 73 | 47 | Over-Represented |
| Prevotella veroralis | species | 13 | 15 | 6 | Over-Represented |
| Corynebacterium durum | species | 12.6 | 14 | 6 | Over-Represented |
| Bifidobacterium thermacidophilum | species | 11.3 | 18 | 8 | Over-Represented |
| Parascardovia | genus | 10.5 | 32 | 18 | Over-Represented |
| Klebsiella oxytoca | species | 10.5 | 27 | 15 | Over-Represented |
| Bifidobacterium scardovii | species | 10 | 30 | 17 | Over-Represented |
| Bifidobacterium cuniculi | species | 9.9 | 31 | 18 | Over-Represented |
| Candidatus Blochmanniella camponoti | species | 9.8 | 21 | 11 | Over-Represented |
| Abiotrophia | genus | 9.2 | 12 | 5 | Over-Represented |
| Enterococcus gilvus | species | 9.1 | 14 | 6 | Over-Represented |
| Megamonas funiformis | species | 8.8 | 21 | 11 | Over-Represented |
| Segatella oulorum | species | 8.6 | 22 | 12 | Over-Represented |
| Ralstonia | genus | 7.9 | 14 | 7 | Over-Represented |
| Bifidobacterium indicum | species | 7.7 | 67 | 48 | Over-Represented |
| Candidatus Blochmanniella | genus | 7.2 | 35 | 22 | Over-Represented |
| ant endosymbionts | clade | 7.2 | 35 | 22 | Over-Represented |
| unclassified Bacteroidetes Order II. | order | 7.2 | 75 | 55 | Over-Represented |
| Enterobacter hormaechei | species | 6.9 | 36 | 23 | Over-Represented |
| Moorella group | norank | 6.7 | 66 | 48 | Over-Represented |

Bottom Line

The next step is to compute similar tables for all symptoms and incorporate these findings into a new algorithm. I say new, because I do not know if it is better than the existing ones. Conceptually, it would be added as a 5th set of suggestions to the existing consensus view on Microbiome Prescription.
