National Chengchi University Institutional Repository (NCCUR): Item 140.119/80326


    Please use this permanent URL to cite or link to this item: https://nccur.lib.nccu.edu.tw/handle/140.119/80326


    Title: 由食譜資料探勘料理特徵樣式
    Mining Cuisine Patterns from Recipe Dataset
    Author: 呂耀茹
    Contributors: 沈錳坤
    呂耀茹
    Keywords: Big data
    Data mining
    Recipes
    Cuisine
    Date: 2015
    Upload time: 2016-01-04 16:58:11 (UTC+8)
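    Abstract: In recent years, more and more people have started cooking for themselves for health reasons, which has driven the growth of recipe community websites. Although data mining has become popular as Big Data has attracted attention, little research has addressed the mining and analysis of large-scale recipe data.
    This study crawls recipe data from the well-known international recipe websites Allrecipes.com, Food.com, and Yummly.com and mines the ingredient patterns and characteristics of the world's major cuisines, including cuisine flavors, commonly used ingredients, signature ingredients, core ingredients, ingredient pairing relationships, similarity and clustering among cuisines, and automatic cuisine classification.
    For data preprocessing, this thesis proposes a method that combines an ingredient lexicon with the connected-component labeling algorithm to resolve ingredient synonyms. To mine the ingredient patterns and characteristics of cuisines, the study applies network analysis, association rules, Phi, and PMI to identify the signature ingredients, core ingredients, and ingredient pairing patterns of each cuisine. In addition, the thesis clusters cuisines by ingredient similarity using hierarchical clustering, in contrast to the usual grouping of cuisines by geographic location. It also proposes using hierarchical classification to determine a recipe's cuisine automatically from its ingredients.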
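The synonym-resolution step can be illustrated with a small sketch. The abstract does not give the details of the method, so everything here is an assumption: the lexicon is taken to supply pairs of names that refer to the same ingredient, the function name `label_synonym_groups` is hypothetical, and picking the shortest name as the group's representative is an arbitrary choice made for the example.

```python
from collections import defaultdict, deque


def label_synonym_groups(synonym_pairs):
    """Map each ingredient name to a canonical name via connected components."""
    # Undirected graph: an edge means the (assumed) lexicon treats both names
    # as the same ingredient.
    graph = defaultdict(set)
    for a, b in synonym_pairs:
        graph[a].add(b)
        graph[b].add(a)

    canonical = {}
    for start in sorted(graph):
        if start in canonical:
            continue  # already labeled as part of an earlier component
        # Breadth-first search collects every name reachable from `start`.
        component, queue = {start}, deque([start])
        while queue:
            node = queue.popleft()
            for neighbor in graph[node]:
                if neighbor not in component:
                    component.add(neighbor)
                    queue.append(neighbor)
        # Hypothetical rule: the shortest name (ties broken alphabetically)
        # represents the whole group.
        representative = min(component, key=lambda name: (len(name), name))
        for name in component:
            canonical[name] = representative
    return canonical


# Toy synonym pairs, invented for illustration only.
pairs = [
    ("scallion", "green onion"),
    ("green onion", "spring onion"),
    ("cilantro", "coriander leaves"),
]
canonical = label_synonym_groups(pairs)

# Names that never appear in the lexicon are left unchanged.
recipe = ["spring onion", "cilantro", "garlic"]
normalized = [canonical.get(item, item) for item in recipe]
```

Because synonymy is treated as transitive, "spring onion" and "scallion" end up in the same group even though no pair links them directly. That transitivity is the reason to label components rather than look up pairs one at a time.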
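    By mining and analyzing the patterns and characteristics of the world's cuisines from the large volume of user-generated data on recipe websites, the style and distinctive features of each cuisine can be understood and then applied to data management and search on recipe websites.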
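As a rough illustration of the Phi and PMI measures named in the abstract, the sketch below scores how strongly one ingredient is associated with one cuisine. It assumes that "Phi" means the phi coefficient of a 2x2 contingency table and that "PMI" means pointwise mutual information. The abstract does not say whether the thesis applies them to ingredient-cuisine pairs or to ingredient-ingredient pairs, so the setup, the function name `pmi_and_phi`, and the toy recipes are all hypothetical.

```python
import math


def pmi_and_phi(recipes, ingredient, cuisine):
    """Return (PMI, phi) between 'recipe uses ingredient' and 'recipe is cuisine'.

    `recipes` is a non-empty list of (cuisine_label, set_of_ingredients) tuples.
    """
    n = len(recipes)
    # 2x2 contingency counts over all recipes.
    n11 = sum(1 for c, items in recipes if c == cuisine and ingredient in items)
    n10 = sum(1 for c, items in recipes if c != cuisine and ingredient in items)
    n01 = sum(1 for c, items in recipes if c == cuisine and ingredient not in items)
    n00 = n - n11 - n10 - n01

    # PMI = log2(P(i, c) / (P(i) * P(c))); reported as -inf if they never co-occur.
    p_joint = n11 / n
    p_ingredient = (n11 + n10) / n
    p_cuisine = (n11 + n01) / n
    if p_joint > 0:
        pmi = math.log2(p_joint / (p_ingredient * p_cuisine))
    else:
        pmi = float("-inf")

    # Phi coefficient of the 2x2 table; defined as 0.0 here if a marginal is empty.
    denom = math.sqrt((n11 + n10) * (n01 + n00) * (n11 + n01) * (n10 + n00))
    phi = (n11 * n00 - n10 * n01) / denom if denom > 0 else 0.0
    return pmi, phi


# Toy data, invented for illustration only.
recipes = [
    ("Italian", {"tomato", "basil"}),
    ("Italian", {"tomato", "garlic"}),
    ("Thai", {"fish sauce", "lime"}),
    ("Thai", {"fish sauce", "garlic"}),
]
tomato_pmi, tomato_phi = pmi_and_phi(recipes, "tomato", "Italian")
garlic_pmi, garlic_phi = pmi_and_phi(recipes, "garlic", "Italian")
```

In this toy set, tomato appears only in Italian recipes and scores high on both measures, while garlic is spread evenly across cuisines and scores zero on both. A signature ingredient would be one with a high score for its cuisine. PMI tends to favor rare ingredients, which is one reason to report it alongside phi.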
    References: [1] Rakesh Agrawal and Ramakrishnan Srikant, Fast Algorithms for Mining Association Rules, International Conference on Very Large Data Bases, VLDB, 1994.
    [2] Yong-Yeol Ahn, Sebastian E. Ahnert, James P. Bagrow, and Albert-László Barabási, Flavor Network and the Principles of Food Pairing, Scientific Reports, Vol. 1, 2011.
    [3] Florian Beil, Martin Ester, and Xiaowei Xu, Frequent Term-based Text Clustering, ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2002.
    [4] Steven Bird, Ewan Klein, and Edward Loper, Natural Language Processing with Python, O'Reilly Media, Inc., 2009.
    [5] Stephen P. Borgatti, Centrality and Network Flow, Social Networks, Vol. 27, No. 1, 2005.
    [6] Corrado Boscarino, N. J. Koenderink, V. Nedović, and J. L. Top, Automatic Extraction of Ingredient's Substitutes, ACM International Joint Conference on Pervasive and Ubiquitous Computing: Adjunct Publication, ACM, 2014.
    [7] L. Breiman, Random Forests, Machine Learning, Vol. 45, 2001.
    [8] Thomas H. Cormen, Charles E. Leiserson, Ronald L. Rivest, and Clifford Stein, Introduction to Algorithms (2nd Edition), McGraw-Hill, 2001.
    [9] Karam Gouda and Mohammed Zaki, Efficiently Mining Maximal Frequent Itemsets, IEEE International Conference on Data Mining, 2001.
    [10] Jiawei Han and Micheline Kamber, Data Mining: Concepts and Techniques, Morgan Kaufmann Publishers, 2001.
    [11] Anna Huang, Similarity Measures for Text Document Clustering, Sixth New Zealand Computer Science Research Student Conference, Christchurch, New Zealand, 2008.
    [12] James Manyika, Michael Chui, Brad Brown, Jacques Bughin, Richard Dobbs, Charles Roxburgh, and Angela H. Byers, Big Data: the Next Frontier for Innovation, Competition, and Productivity, McKinsey & Company, 2011.
    [13] Rada Mihalcea, Courtney Corley, and Carlo Strapparava, Corpus-based and Knowledge-based Measures of Text Semantic Similarity, AAAI, 2006.
    [14] Trung Duc Nguyen, Diep Thi-Ngoc Nguyen, and Yasushi Kiyoki, A Regional Food's Features Extraction Algorithm and Its Application, International Workshop on Multimedia for Cooking & Eating Activities, 2013.
    [15] Tore Opsahl, Filip Agneessens, and John Skvoretz, Node Centrality in Weighted Networks: Generalizing Degree and Shortest Paths, Social Networks, Vol. 32, 2010.
    [16] J. R. Quinlan, C4.5: Programs for Machine Learning, Morgan Kaufmann Publishers, 1993.
    [17] Carlos N. Silla Jr., and Alex A. Freitas, A Survey of Hierarchical Classification across Different Application Domains, Data Mining and Knowledge Discovery, Vol. 22, 2011.
    [18] Han Su, Ting-Wei Lin, Cheng-Te Li, Man-Kwan Shan, and Janet Chang, Automatic Recipe Cuisine Classification by Ingredients, ACM International Joint Conference on Pervasive and Ubiquitous Computing: Adjunct Publication, 2014.
    [19] Aixin Sun, Ee-Peng Lim, and Wee-Keong Ng, Performance Measurement Framework for Hierarchical Text Classification, Journal of the American Society for Information Science and Technology, Vol. 54, 2003.
    [20] Chun-Yuen Teng, Yu-Ru Lin, and Lada A. Adamic, Recipe Recommendation Using Ingredient Networks, ACM Web Science Conference, 2012.
    [21] Kristin M. Tolle, D. Stewart W. Tansley, and Anthony J. Hey, The Fourth Paradigm: Data-Intensive Scientific Discovery [Point of View], Proceedings of the IEEE, Vol. 99, 2011.
    [22] Lav R. Varshney, Florian Pinel, Kush R. Varshney, Debarun Bhattacharjya, Angela Schörgendorfer, and Yi-Min Chee, A Big Data Approach to Computational Creativity, arXiv preprint arXiv:1311.1213, 2013.
    [23] Kush R. Varshney, Lav R. Varshney, Jun Wang, and Daniel Myers, Flavor Pairing in Medieval European Cuisine: A Study in Cooking with Dirty Data, International Joint Conference on Artificial Intelligence Workshops, 2013.
    [24] Liping Wang, Qing Li, Na Li, Guozhu Dong, and Yu Yang, Substructure Similarity Measurement in Chinese Recipes, International Conference on World Wide Web, 2008.
    [25] Yan Xu, Gareth Jones, JinTao Li, Bin Wang, and ChunMing Sun, A Study on Mutual Information-Based Feature Selection for Text Categorization, Journal of Computational Information Systems, Vol. 3, 2007.
    [26] Gephi: https://gephi.org
    [27] LIBSVM: http://www.csie.ntu.edu.tw/~cjlin/libsvm/
    [28] Phi, Wikipedia, retrieved June 20, 2015, from https://en.wikipedia.org/wiki/Phi
    [29] Stanford Parser: http://nlp.stanford.edu/software/lex-parser
    [30] Support vector machine, Wikipedia, retrieved June 18, 2015, from https://en.wikipedia.org/wiki/Support_vector_machine
    [31] Weka: http://www.cs.waikato.ac.nz/ml/weka/
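    Description: Master's
    National Chengchi University
    In-service Master's Program, Department of Computer Science
    102971008
    Source: http://thesis.lib.nccu.edu.tw/record/#G0102971008
    Data type: thesis
    Appears in collections: [In-service Master's Program, Department of Computer Science] Theses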

    Files in this item:

    File        Size    Format     Views
    100801.pdf  5256Kb  Adobe PDF  2232


    All items in NCCUR are protected by original copyright.



    Copyright Announcement
    1. The digital content of this website is part of the National Chengchi University Institutional Repository. It is provided free of charge for non-commercial uses such as academic research and public education. Please use it in a proper and reasonable manner and respect the rights of copyright owners. For commercial use, please obtain authorization from the copyright owner in advance.

    2. Every effort has been made in building this website to avoid infringing the rights of copyright owners. If you believe that any material on the website infringes copyright, please contact our staff (nccur@nccu.edu.tw). We will promptly take remedial measures, including removing the work from the repository.