Feature subset selection problem on microarray data

Özşamlı, Nihan (2009) Feature subset selection problem on microarray data. [Thesis]

Abstract

Recent advances in technology have given rise to tools such as microarray chips. Microarray chips enable scientists to measure the amount of protein produced by the genes in a cell, known as gene expression data. The classification of cell samples by means of their gene expression data is an active research area. The data used in such analyses are massive, so the features, i.e., the genes, must be reduced to a reasonable number, both because of the computational cost of experiments and because irrelevant genes can mislead the classification. Analyses based on the classification of cell samples therefore usually include a feature subset selection phase. This thesis aims to develop a tool that can be used during the feature subset selection phase of such analyses. Three novel algorithms, based on basic association rule mining, are proposed for the gene selection problem. The first algorithm starts with a fuzzy partitioning of the gene expression data and discovers highly confident IF-THEN rules that enable the classification of sample tissues. The second algorithm searches the possible IF-THEN rules using a heuristic pruning approach based on the beam search algorithm. Finally, the third algorithm focuses on the hierarchical information carried by gene expressions, constructing decision trees based on different performance measures. We obtained satisfactory results on the leukemia dataset. In addition, on the colon cancer dataset, the algorithm based on the construction of decision trees showed good performance.
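To make the idea of a "highly confident IF-THEN rule" over fuzzy-partitioned expression data more concrete, the following Python sketch scores a rule of the form "IF gene is HIGH THEN class = target" and ranks genes by that score. It is only an illustration under stated assumptions, not the thesis's algorithm: the linear ramp membership function, the thresholds `lo` and `hi`, the function names, the gene names, and the toy data are all hypothetical.

```python
def high_membership(x, lo, hi):
    """Degree in [0, 1] to which expression value x counts as HIGH.

    Assumed shape: 0 below lo, 1 above hi, linear ramp in between.
    """
    if x <= lo:
        return 0.0
    if x >= hi:
        return 1.0
    return (x - lo) / (hi - lo)


def rule_confidence(values, labels, target, lo, hi):
    """Fuzzy confidence of the rule 'IF gene is HIGH THEN class == target'.

    Confidence = (membership mass in target-class samples)
                 / (membership mass over all samples).
    """
    memberships = [high_membership(v, lo, hi) for v in values]
    antecedent_mass = sum(memberships)
    if antecedent_mass == 0:
        return 0.0  # the gene is never HIGH, so the rule never fires
    rule_mass = sum(m for m, y in zip(memberships, labels) if y == target)
    return rule_mass / antecedent_mass


def rank_genes(expression, labels, target, lo, hi):
    """Return gene names sorted by rule confidence, most confident first."""
    scores = {
        gene: rule_confidence(values, labels, target, lo, hi)
        for gene, values in expression.items()
    }
    return sorted(scores, key=scores.get, reverse=True)


# Hypothetical toy data: two genes measured on four samples.
expression = {
    "gene_a": [0.9, 0.8, 0.1, 0.2],
    "gene_b": [0.5, 0.9, 0.6, 0.8],
}
labels = ["class_1", "class_1", "class_2", "class_2"]

ranking = rank_genes(expression, labels, "class_1", lo=0.25, hi=0.75)
print(ranking)
```

In this toy data, `gene_a` is HIGH only in `class_1` samples, so its rule is fully confident, while `gene_b` is HIGH in both classes and ranks lower. A feature subset selection step along these lines would keep only the top-ranked genes.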
Item Type: Thesis
Uncontrolled Keywords: Feature subset selection. -- Association rule mining. -- Fuzzy logic. -- Pattern classification. -- Gene selection. -- Özellik altkümesi seçimi. -- Kural madenciliği. -- Bulanık mantık. -- Patern sınıflandırma.
Subjects: T Technology > T Technology (General) > T055.4-60.8 Industrial engineering. Management engineering
Divisions: Faculty of Engineering and Natural Sciences > Academic programs > Manufacturing Systems Eng.
Faculty of Engineering and Natural Sciences
Depositing User: IC-Cataloging
Date Deposited: 18 Jan 2011 11:38
Last Modified: 26 Apr 2022 09:53
URI: https://research.sabanciuniv.edu/id/eprint/16313
