
International Journal of Computer Engineering and Information Technology
VOL. 9, NO. 1, January 2017, 15–23
Available online at: www.ijceit.org
E-ISSN 2412-8856 (Online)

Multi-word Aspect Term Extraction Using Turkish User Reviews

Ekin Ekinci1, Hazal Türkmen2 and Sevinç İlhan Omurca3
1, 2, 3 Department of Computer Engineering, Kocaeli University, Kocaeli, İzmit, 41380

ABSTRACT

When an individual wants to buy a product, or a company wants to take the pulse of public opinion about its product, user reviews of that product have become a valuable source of information. As a consequence, aspect-based sentiment analysis has become a popular research field. In this study, we devised a method that extracts multi-word aspects from Turkish user reviews. The method combines a frequency-based N-gram approach with finite state automata built to recognize Turkish grammar rules. The success of the system was measured on cell phone reviews and on hotel reviews. The average success obtained is 82% for the cell phone domain and 79% for the hotel domain.

Keywords: Aspect Based Sentiment Analysis, Aspect Extraction, Multi-word Aspect Extraction, Finite State Automata.

1. INTRODUCTION

Automatic sentiment analysis of online customer reviews has become a needed research area due to the rapid growth of user-generated reviews on the internet. Institutions that manage voice-of-the-customer and social media insight programs particularly require sentiment analysis of online reviews. A wide range of products and services is reviewed on the Web, so the Web has become an excellent source for extracting customer reviews. Because of the huge number of online reviews regarding products and services, supervised machine learning methods are not practical to implement. As the number of reviews expands, it is essential to develop an unsupervised sentiment analysis model that is capable of extracting the product aspects and determining the sentiments for them. Aspect extraction is a critical stage in sentiment analysis; therefore, for an overall sentiment analysis of the reviews, the single-word or multi-word aspects (MWA) of the products must be detected.

In recent years, sentiment analysis for online reviews has attracted a great deal of attention from researchers and product manufacturers [1-4]. Product manufacturers need online reviews to understand the general responses of customers to their products, to find the weaknesses of the products and to improve them accordingly. Analyses have been developed to determine the overall tendency of customers; however, the need for a new method arose once it became clear that this kind of analysis is inadequate for pinpointing the weak and strong sides of individual product specifications.

A document-level analysis yields an overall negative or positive result for a product, but it is not correct to assume that all product specifications are weak when the overall result is negative. Likewise, a positive overall result does not mean that all product specifications are fine. Aspect-based sentiment classification is a more accurate approach, since it gives sentiments separately for the product specifications instead of reflecting a general sentiment about the product. What we mean by the word 'aspect' is anything that defines or completes a product. In aspect-based sentiment analysis, sentences that do not contain an aspect carry no information for the analysis, so this method also makes it possible to use the sentiments in reviews effectively.

In this paper, we concentrate on MWA extraction from customer reviews and propose a novel unsupervised and domain-independent hybrid model for detecting Turkish MWAs. To our knowledge, no previous study has been carried out with this objective. Being unsupervised and domain-independent, the model removes the need for labeled data during MWA extraction. Because the method combines N-grams with heuristic rules particular to a language, the system can be adjusted for other languages. The pointwise mutual information (PMI) method, which is widely used in natural language processing (NLP), contributes to the success of the system. In order to see the performance of the proposed system, a human-generated MWA corpus is used.

The precision and recall values, which determine the coverage between the human-generated and automatically generated MWAs, are used as the performance evaluator.

The rest of the paper is organized as follows: Section 2 reviews the literature; Section 3 explains the proposed MWA extraction model; the experimental results are presented in Section 4. Finally, discussion and conclusions for future work are summarized in Section 5.

2. RELATED WORKS

The literature shows that aspect extraction is one of the cornerstones of sentiment analysis studies. To design a powerful sentiment analysis system, the aspect extraction process should be carried out successfully. To give a better understanding of the subject, the literature is summarized in this section.

In both of their studies, Hu and Liu [5, 6] used a data mining algorithm, association rule mining based on Apriori, to find all frequent itemsets. In this context, an itemset was a set of nouns or noun phrases that occur together. To remove unlikely aspects from these itemsets, they applied two filtering techniques, compactness pruning and redundancy pruning. Popescu and Etzioni [7] devised OPINE, an unsupervised information extraction system that extracts noun phrases from reviews and eliminates them according to a threshold. To remove non-aspects, their system evaluated each noun phrase by computing the PMI score between the phrase and some meronymy discriminators associated with the product class of interest. Yi and Niblack [8] developed heuristics and selection algorithms to extract feature terms from online reviews. A feature term can have a part-of relationship with the given topic, an attribute-of relationship with the given topic, or an attribute-of relationship with a known feature of the given topic. Candidate feature terms, which were noun phrases, were extracted with bBNP heuristics, and a likelihood-ratio-based algorithm was used to select the feature terms among these candidates. Wei et al. [9] proposed a semantic-based product feature extraction (SPE) technique. In this technique, subjective adjectives obtained from the General Inquirer were used to eliminate non-product features and opinion-irrelevant product features and to explore infrequent product features. Zhu et al. [10] constructed an aspect-based opinion polling system. In their system, the C-value method, which is often used for multi-word extraction, was preferred to extract candidate MWA-related terms. Bagheri et al. [2] proposed an unsupervised domain-independent aspect detection model for online reviews. After POS tagging and stemming were applied to extract candidate aspects, a generalized statistical measure was computed for MWAs. The extracted MWAs were ranked with a word scoring method called FLR. Yan et al. [11] developed EXPRS, an extended PageRank algorithm, to extract product features from Chinese online consumer reviews. In that study, nouns, noun phrases and dependency relations were identified with the Chinese lexical analysis tool ICTCLAS, and candidate extraction was performed with the NodeRank algorithm, an extended PageRank algorithm. Li et al. [12] proposed a method that extracts candidate aspects based on frequent nouns and noun phrases and uses the PMI-IR score to prune these candidates. While the PMI score is computed between an aspect and a discriminator, the PMI-IR score is computed between an aspect and the target entity. The RCut algorithm was also used to determine the threshold for selecting candidate aspects.

3. THE PROPOSED MULTI-WORD ASPECT EXTRACTION MODEL

The system architecture designed within the scope of this study is shown in Fig. 1. The first step is to determine the web sites that contain reviews of the domain to be studied and to crawl the reviews from the related pages with a web crawler. When the reviews are examined, many typos are detected. Since these typos affect aspect determination negatively, the Syntax error correction step is carried out with the help of the Turkish NLP library Zemberek. After preprocessing, word N-grams are extracted from the documents along with their frequencies. From the extracted N-grams, candidate MWAs are identified by eliminating those that contain stop words, digits or punctuation. The candidate MWA set is reduced with the frequency-based compactness pruning method and then by applying heuristic rules. The final MWA set is obtained with a PMI-based elimination that identifies the aspects related to the chosen application domain.
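The frequency-based steps outlined above can be illustrated with a minimal Python sketch. It is not the authors' implementation. It covers word bigram counting, elimination of bigrams containing stop words, digits or punctuation, threshold pruning, and the PMI filter of Section 3.4 with the mean PMI as threshold. The stop-word list, the function names, the use of ">=" for both thresholds and the choice to count hits as the number of reviews containing a term are all assumptions made for illustration; the FSA step is omitted.

```python
import re
from collections import Counter

# Hypothetical, tiny stop-word list; a real system would load a full Turkish list.
STOP_WORDS = {"ve", "bir", "bu", "ama", "çok", "için"}
# Words and single punctuation marks become separate tokens.
TOKEN = re.compile(r"\w+|[^\w\s]")


def tokenize(review):
    """Lower-case a review and split it into word and punctuation tokens."""
    return TOKEN.findall(review.lower())


def is_content_word(token):
    """True for purely alphabetic tokens that are not stop words."""
    return token.isalpha() and token not in STOP_WORDS


def candidate_bigrams(reviews):
    """Count word bigrams (N = 2), skipping stop words, digits and punctuation."""
    counts = Counter()
    for review in reviews:
        tokens = tokenize(review)
        for pair in zip(tokens, tokens[1:]):
            if all(is_content_word(token) for token in pair):
                counts[pair] += 1
    return counts


def prune_by_frequency(counts, threshold):
    """Keep bigrams whose frequency reaches the threshold (assumed: >=)."""
    return {pair: n for pair, n in counts.items() if n >= threshold}


def pmi_filter(candidates, reviews, entity):
    """Score each candidate as hits(A, d) / (hits(A) * hits(d)) and keep those
    at or above the mean score. hits() counts reviews here (an assumption)."""
    tokenized = [tokenize(review) for review in reviews]
    token_sets = [set(tokens) for tokens in tokenized]
    bigram_sets = [set(zip(tokens, tokens[1:])) for tokens in tokenized]
    hits_d = sum(entity in tokens for tokens in token_sets)
    scores = {}
    for pair in candidates:
        hits_a = sum(pair in bigrams for bigrams in bigram_sets)
        hits_ad = sum(
            pair in bigrams and entity in tokens
            for bigrams, tokens in zip(bigram_sets, token_sets)
        )
        scores[pair] = hits_ad / (hits_a * hits_d) if hits_a and hits_d else 0.0
    threshold = sum(scores.values()) / len(scores) if scores else 0.0
    return {pair: score for pair, score in scores.items() if score >= threshold}


# Made-up sample reviews, for illustration only.
sample_reviews = [
    "telefon güzel, şarj aleti hızlı",
    "şarj aleti bozuk ama telefon iyi",
    "kargo hızlı geldi",
    "kargo hızlı ve sorunsuz",
]
counts = candidate_bigrams(sample_reviews)
frequent = prune_by_frequency(counts, threshold=2)
related = pmi_filter(frequent, sample_reviews, entity="telefon")
print(sorted(frequent))
print(related)
```

On the made-up sample, two bigrams survive the frequency threshold, and only the one that co-occurs with the entity word "telefon" passes the PMI filter. Using the mean PMI as the cut-off avoids a hand-tuned PMI threshold, whereas the frequency threshold still has to be chosen per domain.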

Fig. 1. Overall process of multi-word aspect extraction system.

3.1 N-grams

The N-gram-based method is often preferred in NLP applications because it is powerful and easy to use. Its independence from language and its low need for grammar knowledge make it easy to apply, and its successful results show that it is powerful.

An N-gram is a sequence of N consecutive members of a document. These members can be the characters composing the document (letters, punctuation, spaces etc.), words, POS labels or other features that occur successively. The choice of member depends on the objective of the study. Character N-grams are usually preferred for author, topic and language identification applications [13-15]. Word N-grams are preferred for finding and using multi-words [16, 17]. POS-tag N-grams are also among the N-gram methods [18].

In this study, word N-grams are used. The objective of the study is to obtain the MWAs in a given set of reviews. When MWAs in Turkish are examined, it is clear that they are usually composed of two words (battery charger, memory card, customer satisfaction etc.). This is why N = 2 is chosen in this study. The two-word combinations in all the reviews are extracted with their frequencies.

3.2 Compactness Pruning

The combinations and frequencies obtained from the N-gram step are filtered with compactness pruning [5, 6]. In this method, the combinations identified as candidate MWAs by the N-gram step are eliminated according to a frequency threshold value.

3.3 Heuristic Rules

A single word may not be enough to explain a notion in detail or to talk about its specifications. To overcome this, several words may need to come together under certain rules. For the notion computer, examples are computer cabinets, computer prices and computer engineering. However, words do not always come together to give details about a notion; some notions themselves correspond to more than one word, for example holiday camp, video card and dishwashing machine. In other cases, words come together to modify a notion and compose a phrase. In the examples boiled corn, kissable hands, familiar faces and cold weather, the first word modifies the second. In brief, words that are gathered under certain rules to express notions in detail, to name notions expressed with more than one word, or to modify notions compose phrases [19]. When these rules are examined, it is seen that the word to be emphasized is found at the very end and the emphasizing word or words are found at the beginning. In noun phrases, the emphasized word at the end is called the determinated and the emphasizing word(s) the determining. Compound nouns are categorized in two groups (adjective or noun) according to the type of the determining, and possessive constructions are divided into three according to the suffix that the determining takes [20].

• Defined compound nouns: In defined compound nouns, both the determinated and the determining are nouns. Both of them take compound suffixes; the determining takes the suffixes (ın, in, un, ün) and the determinated takes (ı, i, u, ü). Example: "Okulun bahçesi (school's garden)", "Bilgisayarın kasası (computer's cabinet)".
• Undefined compound nouns: In undefined compound nouns, both the determinated and the determining are nouns. Only the determinated takes suffixes, which are (ı, i, u, ü). Example: Ekran çözünürlüğü (screen resolution), domates salçası (tomato paste), çilek reçeli (strawberry jam), biber turşusu (pepper pickle).
• Compound nouns with no suffixes: In compound nouns with no suffixes, both the determinated and the determining are nouns. Neither of them takes suffixes. Example: Ölü deniz (Dead Sea).

A finite state automaton (FSA) represents the relational patterns of noun phrases and adjective clauses. These patterns rely on the output of a simple POS tagger, i.e. they consist of POS tags. An FSA begins in one of its states (called the start state), moves through transitions to different states depending on its inputs, and ends in one of a designated set of states that marks a successful run (called final states). An example of a simple noun phrase automaton that can recognize noun phrases in Turkish is illustrated in Fig. 2.

Fig. 2. An FSA of noun phrases.

• Adjectival determinatives: Participles are formed with the suffixes (-en, -esi, -mez, -ar, -dik, -ecek, -miş). Participles, just like adjectives, come before nouns and compose an adjective clause. Example: Haşlanmış mısır (boiled corn), tanıdık yüzler (familiar faces). In this study, the rules for adjectival determinatives are taken into consideration, and an automaton for them is illustrated in Fig. 3.

Fig. 3. An FSA of adjectival determinatives.

3.4 PMI

It is necessary to determine whether each member of the candidate MWA set is associated with the main entity (hotel or cell phone). For this purpose PMI is used, a criterion that is often applied in natural language processing to measure the relation between words and the main entity [3]. Let $d$ denote the main entity and $A_i$ any member of the candidate aspect set; the relation between $A_i$ and $d$ is then defined by PMI as in Equation 1.

$$\mathrm{PMI}(A_i, d) = \frac{\mathrm{hits}(A_i, d)}{\mathrm{hits}(A_i)\,\mathrm{hits}(d)} \tag{1}$$

Here, $\mathrm{hits}(A_i, d)$ is the co-occurrence number of $A_i$ and $d$, and $\mathrm{hits}(A_i)\,\mathrm{hits}(d)$ is the co-occurrence number of $A_i$ and $d$ if they are statistically independent. The value calculated for every candidate is compared with a predefined threshold value. MWAs that cannot reach the threshold value are excluded. Suppose that the candidate aspect set is $A = \{A_1, A_2, \ldots, A_l\}$. Then the PMI threshold value that is used to determine whether an aspect is related enough to the chosen domain is calculated with this formula:

$$\mathrm{PMI}_{threshold}(d) = \frac{1}{l} \sum_{i=1}^{l} \mathrm{PMI}(A_i, d) \tag{2}$$

Calculating the PMI threshold in this way makes it possible to obtain MWAs across the whole range, from the most frequent to the rarest.

4. EXPERIMENTAL RESULTS

In this section we evaluate the experimental results of the proposed MWA extraction model. There are no benchmark methods for aspect extraction in Turkish. Therefore we only compare our results with the results obtained by human experts.

4.1 Dataset Collection and Description

Previous studies on aspect extraction have mostly evaluated English reviews, because conventional datasets of online reviews exist in English for several domains. However, there is no study that carries out aspect extraction using Turkish reviews. In this study we conduct the experiment with Turkish reviews to extract MWAs. Therefore we crawled user reviews from frequently visited web pages, such as ….com for cell phone reviews. Each dataset contains textual reviews that are randomly selected from the web pages collected by our crawler. Table 1 presents the descriptive information about these data collections.

Table 1: Dataset description
Dataset | # of Reviews | # of Sentences | # of MWAs
Hotel, CellPhone: 15186852 461592631445

Three human annotators were asked to extract MWAs from the user reviews independently for each domain. Only the aspects that all annotators had agreed on were included in the final MWA set.
The agreed aspect numbers of the two domains are shown in the last column of Table 1. These data sets can be used in similar studies.

4.2 Evaluation Metrics

In previous research, precision, recall and F-measure have been used as the metrics to assess the effectiveness of the proposed approaches. Alternatively, in this study the aspect extraction model is considered as an Information Retrieval (IR) system. In IR, usually no decision is made on whether a document is relevant or irrelevant to another document. Instead, a ranking of the documents is produced [21]. An aspect extraction system is also an IR system in which the most important aspects are extracted from a given domain. Based on this assumption, our study examines not only the MWAs of the user reviews but also the ranked list of the aspects according to their popularity and frequency in user reviews. Considering the MWAs as a ranked list provides a better analysis of their importance with regard to user opinions.

Given a user review $D$, the evaluation method first computes relevance scores for all MWAs in $D$ and then produces a ranking $R_{mwa} = (mwa_1, mwa_2, \ldots, mwa_n)$ of these aspects based on their relevance scores. Here $mwa_1$ is the most relevant aspect to the review text and $mwa_n$ is the least relevant. The precision and recall values at each $mwa_i$ in the ranking are computed. A general representation of a ranked MWA list is shown in Table 2.

Table 2: General rank list
Rank   Agreed/Not agreed   p(i)   r(i)
1      agreed              1/1    1/n
2      not agreed          1/2    1/n
...    ...                 ...    ...
i      ...                 ...    ...
...    ...                 ...    ...
n      ...                 ...    ...

Recall at position $i$, denoted by $r(i)$, is the fraction of relevant multi-word aspects from $mwa_1$ to $mwa_i$ in $R_{mwa}$. The recall value is computed as in Equation 3.

$$r(i) = \frac{relevant_i}{|R_{mwa}|} \tag{3}$$

where $relevant_i$ represents the number of relevant multi-word aspects among the first $i$ ranks. Precision at position $i$, denoted by $p(i)$, is the fraction of the multi-word aspects from $mwa_1$ to $mwa_i$ in $R_{mwa}$ that are relevant, and it is computed as in Equation 4.

$$p(i) = \frac{relevant_i}{i} \tag{4}$$

The computed precision and recall values enable the evaluation of the coverage between the manually and automatically generated multi-word aspects. Further, an average precision can be computed based on the precision at each level in the ranking $R_{mwa}$, as in Equation 5.

$$p_{avg} = \frac{\sum_{mwa_i \in Relevant_{mwa}} p(i)}{|R_{mwa}|} \tag{5}$$

Here, $Relevant_{mwa}$ represents the set of relevant multi-word aspects, which are defined by human experts. The performance evaluation is based on the judgments of human experts. The quality of a computer-generated multi-word aspect list is tested by its precision and recall values against a manually generated MWA list, which is represented by $Relevant_{mwa}$ in Equation 5.

4.3 Empirical Results

When the proposed model is applied to the cell phone reviews, 14007 candidate MWAs are first found via the N-gram method. Compactness pruning is then carried out with a threshold value of 27, which is determined by a human expert. As a result of this step, the number of candidate aspects is reduced to 863. Each of these candidates is presented to the FSAs defined for noun and adjective phrases, and the MWAs that reach a final state are accepted as valid noun or adjective phrases according to Turkish grammar. 97 MWAs are recognized by the FSAs. Finally, a PMI list containing 45 MWAs is obtained from the heuristic list by applying the PMI method. The flowchart for the cell phone domain is displayed in Fig. 4. The precision and recall values of the PMI list containing the final MWAs for cell phones are shown in Table 3 as a ranked list.

Fig. 4. The flowchart for the cell phone domain.

Table 3: MWA list for cell phone domain
Rank   Aspect name (English/Turkish)               Agreed/Not agreed   p(i)   r(i)
1      cell phone (cep telefonu)                   agreed              100    2
2      smart phone (akıllı telefon)                agreed              100    4
3      battery charger (şarj aleti)                agreed              100    7
4      operating system (işletim sistemi)          agreed              100    9
5      charging adapter (şarj cihazı)              agreed              100    11
6      user manual (kullanım klavuzu)              agreed              100    13
...
10     guarantee certificate (garanti belgesi)     agreed              80     18
...
21     telephone game (telefon oyun)               not agreed          81     38
...
39     screen quality (ekran kalitesi)             agreed              79     69
40     picture video (resim video)                 not agreed          78     69
41     ideal phone (ideal telefon)                 not agreed          76     69
42     super screen (ekran süper)                  not agreed          74     69
43     speech qualification (konuşma özelliği)     agreed              74     71
44     phone suggestion (telefon tavsiye)          not agreed          73     71
45     video quality (video kalitesi)              agreed              73     73

As listed in Table 3, the cell phone domain contains 45 MWAs. Our system ranked "cell phone" as the most critical aspect and "video kalitesi" (video quality) as the least critical aspect among them. As a result of our application, the aspects are ordered from most voted to least voted as a ranked list. Moreover, the system lets experts make a more flexible evaluation: if only a few MWAs are to be chosen, as many as wanted can be taken from the top of the list. For example, when the first 6 aspects are chosen as the final set, both the precision value and the average precision value of the system at the 6th level are 100%. Alternatively, for the first 10 aspects, p(10) is 80% and the average precision is 75.8%. When the whole list is chosen as the final set, the average precision is 82% and p(45) is 73%. The precision and recall values for the cell phone domain are shown in Table 3.

For the hotel domain, we obtained 46 MWAs from a total of 10924 candidate MWAs. Among the extracted MWAs, the most voted feature is "food and drink" and the least voted is "employee attention", with rank 43. Features 44, 45 and 46 are not taken into consideration because the human experts do not evaluate them as aspects. When the first 43 members are chosen as the related MWAs, the precision of the final set is 88% and the recall is 83%; the average precision for the hotel domain is calculated as 79.2%.
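The ranked-list figures quoted above can be checked with a short Python sketch of Equations 3 to 5, with p(i) = relevant_i / i, r(i) = relevant_i / n, and the average precision taken as the sum of p(i) over agreed ranks divided by n, where n is the length of the chosen list. The function name and the example flags are hypothetical. In particular, the relevance of ranks 7 to 9 is not listed in Table 3, so the pattern below is only one assumption that is consistent with the reported p(10) of 80% and average precision of 75.8%.

```python
def ranked_metrics(relevant_flags):
    """Return (p, r, p_avg) for a ranked list of agreed / not agreed flags.

    p[i-1] = relevant_i / i and r[i-1] = relevant_i / n, where n is the
    length of the ranked list; p_avg sums p(i) over the agreed ranks and
    divides by n, following Equations 3 to 5 as given in the paper.
    """
    n = len(relevant_flags)
    precisions, recalls = [], []
    relevant_so_far = 0
    for i, is_relevant in enumerate(relevant_flags, start=1):
        relevant_so_far += int(is_relevant)
        precisions.append(relevant_so_far / i)
        recalls.append(relevant_so_far / n)
    agreed_precisions = [p for p, rel in zip(precisions, relevant_flags) if rel]
    p_avg = sum(agreed_precisions) / n if n else 0.0
    return precisions, recalls, p_avg


# Hypothetical top-10 list: ranks 1-6, 9 and 10 agreed, ranks 7 and 8 not.
top10 = [True] * 6 + [False, False, True, True]
p, r, p_avg = ranked_metrics(top10)
print(f"p(10) = {p[-1]:.0%}, average precision = {p_avg:.1%}")
# prints "p(10) = 80%, average precision = 75.8%"
```

Note that the recall values in Table 3 are relative to the full 45-item list (for example r(10) = 8/45, about 18%), whereas this call treats the first 10 ranks as the whole list, as the text does when it quotes the average precision for a truncated final set.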
The obtained results for the hotel domain are given in Table 4.

Table 4: MWA list for hotel domain
Rank   Aspect name (English/Turkish)             Agreed/Not agreed   p(i)   r(i)
1      food and drink (yiyecek içecek)           agreed              100    2
2      holiday camp (tatil köyü)                 agreed              100    4
...
18     quality of service (servis kalitesi)      agreed              100    39
19     human relations (insan ilişkileri)        agreed              100    41
...
43     employee concern (personel ilgisi)        agreed              88     83
44     over the age of (yaş üstü)                not agreed          86     83
45     food beverage (yemek içecek)              not agreed          84     83
46     hotel full (otel dolu)                    not agreed          83     83

5. CONCLUSIONS

Understanding the features of a product from its customer reviews has become an important application and a necessity. Which features are the most eye-catching for both potential customers and the producing company is a fundamental question. In this study, a system is presented that automatically discovers two-word aspects from Turkish reviews of products from different domains. Automatic detection of MWAs in the hotel and cell phone domains is carried out using Turkish customer reviews. It is observed that the product features found in customer reviews are either single words or two-word phrases. In this study, instead of single words we focused mostly on obtaining two-word aspects, which depend on the natural language. The existing heuristic rules for forming Turkish two-word phrases are encoded as noun and adjective FSAs. The phrases recognized by these FSAs are valid in Turkish and are accepted as MWAs in our study. To form the input for the noun and adjective FSAs, all the candidate MWAs in the customer reviews are first detected. The candidate MWA set is pruned with the compactness pruning method. The domain suitability of the MWAs recognized by the FSAs is tested with the PMI method.

The ranked list method, an information retrieval evaluation method that has not been used before in sentiment analysis, is used to evaluate the success of the system. MWAs are extracted from customer reviews with an average precision of 79% in the hotel domain and 82% in the cell phone domain.

ACKNOWLEDGMENTS

This study is supported by The Scientific and Technological Research Council of Turkey (TUBITAK) under project number 114E422.

REFERENCES

[1] Brody, S., Elhadad, N., An Unsupervised Aspect-Sentiment Model for Online Reviews, In: Proceedings of the 2010 Annual Conference of the North American Chapter of the ACL, Los Angeles, California, USA, 2010, pp 804-812.
[2] Bagheri, A., Saraee, M., de Jong, F., Care more about customers: Unsupervised domain-independent aspect detection for sentiment analysis of customer reviews, Knowledge-Based Systems, 2013, 52, pp 201-213.
[3] Zhang, W., Xu, H., Wan, W., Weakness Finder: Find product weakness from Chinese reviews by using aspects based sentiment analysis, Expert Systems with Applications, 2012, 39 (11), pp 10282-10291.
[4] Shi, H., Zhan, W., Li, X., A Supervised Fine-Grained Sentiment Analysis System for Online Reviews, Intelligent Automation & Soft Computing, 2015, 21 (4), pp 589-605.
[5] Hu, M., Liu, B., Mining Opinion Features in Customer Reviews, In: Proceedings of 19th National Conference on Artificial Intelligence, California, USA, 2004, pp 755-760.
[6] Hu, M., Liu, B., Mining and Summarizing Customer Reviews, In: Proceedings of International Conference on Knowledge Discovery and Data Mining, Seattle, Washington, USA, 2004, pp 168-177.
[7] Popescu, A. M., Etzioni, O., Extracting product features and opinions from reviews, In: Proceedings of Conference on Empirical Methods in Natural Language Processing, Stroudsburg, PA, USA, 2005, pp 339-346.
[8] Yi, J., Niblack, W., Sentiment Mining in WebFountain, In: ICDE '05 Proceedings of the 21st International Conference on Data Engineering, Washington, USA, 2005, pp 1073-1083.
[9] Wei, C., Chen, Y., Yang, C., Yang, C. C., Understanding what concerns consumers: a semantic approach to product feature extraction from consumer reviews, Information Systems and e-Business Management, 2010, 8 (2), pp 149-167.
[10] Zhu, J., Wang, H., Zhu, M., Tsou, B. K., Ma, M., Aspect-Based Opinion Polling from Customer Reviews, Affective Computing, IEEE Transactions on, 2011, 2 (1), pp 37-49.
[11] Yan, Z., Xing, M., Zhang, D., Ma, B., EXPRS: An extended pagerank method for product feature extraction from online consumer reviews, Information & Management, 2015, 52 (7), pp 850-858.
[12] Li, S., Zhou, L., Li, Y., Improving aspect extraction by augmenting a frequency-based method with web-based similarity measures, Information Processing & Management, 2015, 51 (1), pp 58-67.
[13] Stamatatos, E., On the Robustness of Authorship Attribution Based on Character n-gram Features, Journal of Law and Policy, 2013, 21, pp 421-439.
[14] Gencosman, B., Ozmutlu, H., Ozmutlu, S., Character n-gram application for automatic new topic identification, Information Processing & Management, 2014, 50 (6), pp 821-856.
[15] Shi Hayta, Ş., Takçı, H., Eminli, M., Language Identification Based on N-gram Feature Extraction Method by Using Classifiers, IU-Journal of Electrical & Electronics Engineering, 2013, 13 (2), pp 1629-1639.
[16] Antonia, A., Craig, H., Elliott, J., Language chunking, data sparseness, and the value of a long marker list: explorations with word n-grams and authorial attribution, Literary & Linguistic Computing, 2014, 29 (2), pp 147-163.
[17] Baayen, H., Hendrix, P., Ramscar, M., Sidestepping the Combinatorial Explosion: An Explanation of N-gram Frequency Effects Based on Naive Discriminative Learning, Language and Speech, 2013, 56 (3), pp 329-347.
[18] Hasan, F. M., UzZaman, N., Khan, M., Elleithy, K. (ed), Comparison of different POS Tagging Techniques (N-Gram, HMM and Brill's tagger) for Bangla, In: Advances and Innovations in Systems, Computing Sciences and Software Engineering (Springer), 2007, pp 121-126.
[19] Özkan, M., Sevinçli, V., Türkiye Türkçesi Söz Dizimi, İstanbul, 2008.
[20] Gökdayı, H., Türkiye Türkçesinde Öbekler, International Periodical For the Languages, Literature and History of Turkish or Turkic, 2010, 5 (3), pp 1297-1319.
[21] Liu, B., Web Data Mining: Exploring Hyperlinks, Contents, and Usage Data, In: Data-Centric Systems with Applications, 2011.

AUTHOR PROFILES:

Ekin Ekinci is a Research Assistant in the Computer Engineering Department at Kocaeli University in Turkey. She received her BEng. in Computer Engineering from Çanakkale Onsekiz Mart University in 2009 and her ME. in Computer Engineering from Gebze Technical University in 2012. She is currently working towards a Ph.D. degree in Computer Engineering at Kocaeli University. Her main research interests include text mining, sentiment analysis, natural language processing and machine learning.

Hazal Türkmen received her BEng. in 2013 and ME. in 2012 in Computer Engineering from Kocaeli University. She is currently working towards a Ph.D. degree in Computer Engineering at Kocaeli University, Turkey. Her main research interests include text mining, sentiment analysis, natural language processing and machin
