Çok Amaçlı Genetik Algoritma ile Karışık Verilerin Sınıflandırılması

Sert, Onur Can

Please use this identifier to cite or link to this item: https://hdl.handle.net/20.500.11851/681

Full metadata record

DC Field	Value	Language
dc.contributor.advisor	Özyer, Tansel	-
dc.contributor.author	Sert, Onur Can	-
dc.date.accessioned	2019-03-12T18:58:53Z
dc.date.available	2019-03-12T18:58:53Z
dc.date.issued	2012	-
dc.identifier.uri	https://tez.yok.gov.tr/UlusalTezMerkezi/tezSorguSonucYeni.jsp	-
dc.identifier.uri	https://hdl.handle.net/20.500.11851/681	-
dc.description.abstract	In recent years, reaching and extracting the hidden information that is useful to the user from ever-growing datasets has become an increasingly important research topic. From this information, relationships among the data can be identified, and the data can be clustered and classified using various methods. Many algorithms have been developed to extract this information, and these techniques are now actively used in banking, bioinformatics, the health sector, and many similar fields. Algorithms such as k-means and k-modes cluster datasets that contain only numerical or only categorical attributes, but few methods handle datasets in which numerical and categorical attributes are mixed. This thesis investigates the clustering of datasets with mixed attributes and proposes a solution method. In the proposed method, a mixed-attribute dataset is split according to attribute type and each part is evaluated; the results obtained separately for the numerical and categorical parts are then combined to reach the final result. To do this, different distance (similarity) metrics are defined for numerical and categorical attributes. Finally, these distance metrics are placed in a k-means structure to obtain the desired algorithm. Using the results of this algorithm, the ideal number of clusters is estimated with various metrics; the performance of the results is measured with a metric called purity and compared with results obtained by other methods.	en_US
dc.description.abstract	Extracting information that is useful to users from datasets has become a popular and important research area in computer science. Using the extracted information, relationships among the data can be identified and the data can be clustered or classified. Many algorithms have been developed for this purpose, and they are used in areas such as banking, bioinformatics, and medicine. Many algorithms cluster datasets that contain only numerical or only categorical attributes. However, few algorithms are suitable for mixed datasets, which contain both numerical and categorical attributes. This thesis develops a new clustering algorithm for all three types of datasets (numerical, categorical, and mixed). The proposed algorithm separates the attributes into numerical and categorical types, calculates the distances between data points, and returns a clustering result. Fitness functions are used to calculate the distance between two data points. These functions are defined separately for numerical and categorical attributes and are used in the same way as the fitness functions in the k-modes and k-means algorithms. Finally, the clustering results returned by the algorithm are evaluated and the optimal numbers of clusters are determined. The quality of the results is evaluated with the purity index and compared with the results of other algorithms.	en_US
dc.language.iso	en	en_US
dc.rights	info:eu-repo/semantics/openAccess	en_US
dc.subject	Data mining	en_US
dc.subject	Computational methods	en_US
dc.subject	Genetic algorithms	en_US
dc.title	Çok Amaçlı Genetik Algoritma ile Karışık Verilerin Sınıflandırılması	en_US
dc.title.alternative	Clustering Mixed Datasets Using Multi Objective Genetic Algorithm	en_US
dc.type	Master Thesis	en_US
dc.department	Institutes, Graduate School of Engineering and Science	en_US
dc.department	Enstitüler, Fen Bilimleri Enstitüsü	en_US
dc.relation.publicationcategory	Tez	en_US
dc.identifier.scopusquality	N/A	-
dc.identifier.wosquality	N/A	-
item.cerifentitytype	Publications	-
item.languageiso639-1	en	-
item.grantfulltext	open	-
item.openairetype	Master Thesis	-
item.openairecristype	http://purl.org/coar/resource_type/c_18cf	-
item.fulltext	With Fulltext	-
Appears in Collections:	Bilgisayar Mühendisliği Yüksek Lisans Tezleri / Computer Engineering Master Theses
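
Illustrative note: the abstract describes separate distance measures for numerical and categorical attributes that are combined in a k-means style structure, with results scored by a purity index. The record does not give the thesis's actual formulas, so the following is only a minimal sketch under stated assumptions: squared Euclidean distance for the numerical part, simple mismatch counting for the categorical part, and a hypothetical weight `gamma` to combine them. The function names `mixed_distance` and `purity` are illustrative, not taken from the thesis.

```python
from collections import Counter


def mixed_distance(a_num, a_cat, b_num, b_cat, gamma=1.0):
    """Distance between two records with numerical and categorical parts.

    Assumption (not from the thesis): squared Euclidean distance on the
    numerical values plus gamma times the number of categorical mismatches.
    """
    numeric = sum((x - y) ** 2 for x, y in zip(a_num, b_num))
    categorical = sum(1 for x, y in zip(a_cat, b_cat) if x != y)
    return numeric + gamma * categorical


def purity(cluster_ids, class_labels):
    """Fraction of points whose class is the majority class of their cluster."""
    counts_by_cluster = {}
    for cid, label in zip(cluster_ids, class_labels):
        counts_by_cluster.setdefault(cid, Counter())[label] += 1
    # For each cluster, count the points that belong to its most common class.
    majority_total = sum(
        counts.most_common(1)[0][1] for counts in counts_by_cluster.values()
    )
    return majority_total / len(class_labels)
```

A purity of 1.0 means every cluster contains a single class; lower values indicate clusters that mix classes. The weight `gamma` is a tuning choice in this sketch and would need to be set according to the scale of the numerical attributes.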