Evaluation of Statistical Methods for Classification of Laser-Induced Breakdown Spectroscopy (LIBS) Data

dc.contributor: Gifford, Janice
dc.contributor.advisor: Kim, Ji Young
dc.contributor.advisor: Dyar, Darby
dc.contributor.author: DeVeaux, Michelle
dc.date.accessioned: 2012-07-17T13:01:00Z
dc.date.available: 2012-07-17T13:01:00Z
dc.date.gradyear: 2012 [en_US]
dc.date.issued: 2012-07-17
dc.description.abstract: When NASA’s Curiosity rover lands in August 2012, it will use a laser-induced breakdown spectroscopy (LIBS) instrument to collect data in an effort to understand the chemical composition and geological classification of the rocks on Mars. This is part of a larger endeavor to determine information about the planet’s habitability. LIBS is a method used to determine the elemental composition of a given sample. For each rock sample analyzed by the instrument, a LIBS spectrum consisting of over 6,000 different channels is obtained. To prepare for the return of LIBS data from the rover, this project aims to evaluate the accuracy of statistical methods, such as discriminant analysis, support vector machines, and clustering algorithms, for categorizing the rock samples into groups with similar chemical compositions based on their LIBS spectra alone. Accurate classification is critical for rapid identification of similar unknown samples, for novelty detection, and for the selection of a training set of data for use in the estimation of chemical compositions. Similar studies have been performed; however, they generally fail to use statistical best practices and therefore report wildly optimistic results. The data used in this project is from the “century set”, a suite of 100 igneous rock samples. These 100 samples are the only ones currently available for this project that have both LIBS spectra and known chemical compositions. Having the known chemical compositions allowed the century set samples to be divided into groups with geological similarities based on their Total Alkali-Silica (TAS) classes, and provided a way to evaluate the predictive accuracy of the classification algorithms using K-fold cross validation. The results show that the small sample size and uneven distribution of samples across TAS classes make classification into many groups difficult, contradicting many of the outcomes reported in the literature.
However, some of the methods explored in this thesis do show promise based on their performance in simpler classification tasks, so the results should be reevaluated once more data is obtained. LIBS data is scarce, so this thesis also briefly explores the results from one method of simulating a LIBS spectrum based on the sample’s chemical composition. Simulated data could be used to examine the effects of sample size on the accuracies of the various classification algorithms. [en_US]
dc.description.sponsorship: Mathematics & Statistics [en_US]
dc.identifier.uri: http://hdl.handle.net/10166/3182
dc.language.iso: en_US [en_US]
dc.rights: Attribution-ShareAlike 3.0 United States [*]
dc.rights.restricted: public
dc.rights.uri: http://creativecommons.org/licenses/by-sa/3.0/us/ [*]
dc.subject: statistical classification [en_US]
dc.subject: clustering [en_US]
dc.subject: spectroscopy [en_US]
dc.subject: LIBS [en_US]
dc.subject: machine learning [en_US]
dc.subject: discriminant analysis [en_US]
dc.subject: support vector machine [en_US]
dc.subject: statistics [en_US]
dc.title: Evaluation of Statistical Methods for Classification of Laser-Induced Breakdown Spectroscopy (LIBS) Data [en_US]
dc.type: Thesis
mhc.degree: Undergraduate [en_US]
mhc.institution: Mount Holyoke College

Files

Original bundle
Name: DeVeaux_Thesis.pdf
Size: 857.8 KB
Format: Adobe Portable Document Format
Description: Thesis full text
License bundle
Name: license.txt
Size: 1.9 KB
Format: Item-specific license agreed upon to submission