National Chengchi University Institutional Repository (NCCUR): Item 140.119/80326

Please use this identifier to cite or link to this item: https://nccur.lib.nccu.edu.tw/handle/140.119/80326

Title:	Mining Cuisine Patterns from Recipe Dataset (由食譜資料探勘料理特徵樣式)
Authors:	呂耀茹
Contributors:	沈錳坤; 呂耀茹
Keywords:	Big Data; Data Mining; Recipe; Cuisine
Date:	2015
Issue Date:	2016-01-04 16:58:11 (UTC+8)
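Abstract:	In recent years, more and more people have started cooking for themselves for health reasons, which has also driven the growth of recipe community websites. Although data mining has become popular as Big Data has drawn attention, few studies have mined and analyzed large-scale recipe data. This study crawls recipe data from the well-known recipe websites Allrecipes.com, Food.com, and Yummly.com and mines the ingredient patterns and characteristics of the world's major cuisines, including cuisine flavor, commonly used ingredients, distinctive ingredients, core ingredients, ingredient pairing relationships, similarity and clustering among cuisines, and automatic cuisine classification. For data preprocessing, this thesis proposes a method for resolving ingredient synonyms that combines an ingredient lexicon with the connected-component labeling algorithm. To mine the ingredient patterns and characteristics of cuisines, the study applies network analysis, association rules, Phi, and PMI to analyze each cuisine's distinctive ingredients, core ingredients, and ingredient pairing patterns. In addition, the thesis clusters cuisines by ingredient similarity using hierarchical clustering, in contrast to the usual grouping of cuisines by geographic location. It also proposes hierarchical classification to automatically determine a recipe's cuisine type from its ingredients. Mining the large volume of user-generated data on recipe websites makes it possible to understand the style and distinguishing features of each cuisine, and the results can be applied to data management and search on recipe websites.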
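The abstract names Phi and PMI as measures of ingredient pairing but does not define them. The following is a minimal sketch, assuming "Phi" refers to the phi coefficient of a 2x2 contingency table and "PMI" to pointwise mutual information, both computed over recipe-level co-occurrence. The function name `cooccurrence_scores` and the toy recipes are hypothetical illustrations and are not taken from the thesis.

```python
import math
from collections import Counter
from itertools import combinations


def cooccurrence_scores(recipes):
    """Return {(a, b): (pmi, phi)} for every ingredient pair that co-occurs.

    Assumption: each recipe is a list of ingredient names, and a pair
    "co-occurs" when both ingredients appear in the same recipe.
    """
    n = len(recipes)
    single = Counter()  # number of recipes containing each ingredient
    pair = Counter()    # number of recipes containing both ingredients
    for ingredients in recipes:
        items = sorted(set(ingredients))  # de-duplicate, fix pair order
        single.update(items)
        pair.update(combinations(items, 2))

    scores = {}
    for (a, b), n_ab in pair.items():
        n_a, n_b = single[a], single[b]
        # PMI = log2( P(a, b) / (P(a) * P(b)) )
        pmi = math.log2(n_ab * n / (n_a * n_b))
        # Phi coefficient of the 2x2 presence/absence table
        denom = math.sqrt(n_a * (n - n_a) * n_b * (n - n_b))
        phi = (n * n_ab - n_a * n_b) / denom if denom else 0.0
        scores[(a, b)] = (pmi, phi)
    return scores


# Hypothetical toy data, for illustration only
recipes = [
    ["soy sauce", "ginger", "garlic"],
    ["soy sauce", "ginger", "rice"],
    ["tomato", "basil", "garlic"],
    ["tomato", "basil", "olive oil"],
]
scores = cooccurrence_scores(recipes)
```

In this toy data, ginger and soy sauce always appear together, so both measures are at their maximum for that pair, while garlic and ginger co-occur only as often as chance would predict and score zero on both.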
Description:	Master's thesis, National Chengchi University, Executive Master Program of Computer Science, 102971008
Source URI:	http://thesis.lib.nccu.edu.tw/record/#G0102971008
Data Type:	thesis
Appears in Collections:	[Executive Master Program of Computer Science of NCCU] Theses

Files in This Item:

File	Size	Format
100801.pdf	5256Kb	Adobe PDF
