National Chengchi University Institutional Repository (NCCUR): Item 140.119/100572
    Please use this identifier to cite or link to this item: https://nccur.lib.nccu.edu.tw/handle/140.119/100572


    Title: 跨平台社群媒體圖文檢索系統之設計與實作
    Design and Implementation of a Text and Image Retrieval Tool for Cross-Platform Social Media Content
    Authors: 許展嘉
    Hsu, Chan Chia
    Contributors: 陳恭
    Chen, Kung
    許展嘉
    Hsu, Chan Chia
    Keywords: social media
    search engine
    information retrieval
    Elasticsearch
    Date: 2016
    Issue Date: 2016-08-22 13:40:52 (UTC+8)
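    Abstract: Over the years, digital humanities researchers at our university have collected social media text data on major events such as elections, disasters, and social movements, drawn from Twitter, Facebook, the PTT bulletin board system (PTT), and online real-time news. This large body of discourse reflects the opinions, emotions, and interactions of online communities and news media when major events occur, and it has considerable research value. However, these texts have never been analyzed thoroughly, chiefly because no effective retrieval system has been available to help the researchers explore and study content from different media sources.
    This thesis therefore designs and builds a cross-source retrieval system. Based on the characteristics of the data and metadata of the collected Twitter, Facebook, PTT, and real-time news texts, it applies field redefinition, relational data conversion, and Chinese word segmentation to convert the data into a data set suited to Chinese-language retrieval. It then uses Elasticsearch, an open-source search engine, to search this large volume of data, and provides a flexible query interface together with a user management mechanism. Digital humanities researchers can thus quickly search the content of each social media source by data set, keyword, image, time range, and so on, while visualized search results show how much attention specific events received on different social media and how users reacted. The system lays a foundation for an integrated data source channel for cross-platform social media text and image retrieval.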
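The field redefinition step described above can be illustrated with a minimal sketch. It assumes each source delivers records as dictionaries and that timestamps are already ISO 8601 strings. The raw field names in `FIELD_MAP` and the unified names (`source`, `author`, `published_at`, `text`, `extra`) are hypothetical; the abstract does not list the actual schema used in the thesis.

```python
from typing import Any, Dict

# Hypothetical raw field names per source; not taken from the thesis.
FIELD_MAP: Dict[str, Dict[str, str]] = {
    "twitter":  {"author": "screen_name", "published_at": "created_at",   "text": "text"},
    "facebook": {"author": "from_name",   "published_at": "created_time", "text": "message"},
    "ptt":      {"author": "author",      "published_at": "post_time",    "text": "content"},
    "news":     {"author": "reporter",    "published_at": "publish_time", "text": "body"},
}


def normalize(source: str, raw: Dict[str, Any]) -> Dict[str, Any]:
    """Map one source-specific record onto the shared field names."""
    if source not in FIELD_MAP:
        raise ValueError(f"unknown source: {source}")
    mapping = FIELD_MAP[source]
    doc: Dict[str, Any] = {"source": source}
    for unified_name, raw_name in mapping.items():
        # Missing fields become None so every document has the same keys.
        doc[unified_name] = raw.get(raw_name)
    # Keep unmapped fields under "extra" so source-specific metadata is not lost.
    mapped = set(mapping.values())
    doc["extra"] = {k: v for k, v in raw.items() if k not in mapped}
    return doc
```

Calling `normalize("ptt", record)` returns a dictionary with the same top-level keys for every source, which is one way to keep a schema unified while still carrying source-specific metadata along in `extra`.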
    In the past few years, digital humanities researchers at our university have collected a large amount of social media text data about major public events such as elections, disasters, and social movements from several sources, namely Twitter, Facebook, PTT, and real-time news. These text data largely reflect the opinions, emotions, and interactions of online communities at the time of major events and are therefore valuable research assets. However, for lack of a suitable information retrieval tool, the researchers have not been able to carry out in-depth studies of these social media text data.
    Therefore, this thesis presents the design and implementation of an information retrieval system for these cross-media data sets, based on the popular search engine Elasticsearch. Our system first preprocesses the data and metadata of the social media texts into a unified yet flexible data schema, then builds indices so that users can run full-text searches over both the text itself and metadata attributes such as date of publication and author. We also provide visualization aids that display the search results in a user-friendly manner. Overall, our system gives researchers a tool for exploring social media text data from various sources easily and effectively.
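A search over both full text and metadata attributes, as described above, might be expressed in the Elasticsearch query DSL roughly as follows. This is a sketch only: the function `build_query` and the field names `text`, `author`, `source`, and `published_at` are assumptions for illustration, and it further assumes that `source` is indexed as an exact-value (not analyzed) field and `published_at` as a date. It builds the request body without contacting a cluster.

```python
from typing import Any, Dict, List, Optional


def build_query(
    keywords: str,
    sources: Optional[List[str]] = None,
    start: Optional[str] = None,
    end: Optional[str] = None,
) -> Dict[str, Any]:
    """Build a bool query: full-text match plus optional metadata filters."""
    filters: List[Dict[str, Any]] = []
    if sources:
        # Restrict results to the selected data sets (hypothetical "source" field).
        filters.append({"terms": {"source": sources}})
    date_range: Dict[str, str] = {}
    if start:
        date_range["gte"] = start
    if end:
        date_range["lte"] = end
    if date_range:
        # Restrict results to a publication time interval.
        filters.append({"range": {"published_at": date_range}})
    return {
        "query": {
            "bool": {
                # Scored full-text match over the body text and the author name.
                "must": [{"multi_match": {"query": keywords, "fields": ["text", "author"]}}],
                # Filters narrow the result set without affecting relevance scores.
                "filter": filters,
            }
        }
    }
```

For example, `build_query("election", sources=["ptt", "twitter"], start="2016-01-01", end="2016-01-31")` yields a body that could be serialized with `json.dumps` and sent to a search endpoint. Putting the source and date constraints in `filter` rather than `must` is a design choice: they act as yes/no conditions and need not contribute to ranking.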
    Reference: 1. NoSQL, from: http://zh.wikipedia.org/wiki/NoSQL
    2. Elasticsearch-Definitive-Guide, from: https://github.com/elastic/elasticsearch-definitive-guide
    3. Elasticsearch Reference, from: https://www.elastic.co/guide/en/elasticsearch
    4. jQuery, from: http://jquery.com
    5. Spring Framework, from: https://projects.spring.io/spring-framework
    6. Hibernate, from: http://hibernate.org/
    7. Echarts, from: http://echarts.baidu.com/
    8. Jest, from: https://github.com/searchbox-io/Jest/tree/master/jest
    9. PostgreSQL, from: https://www.postgresql.org/
    Description: Master's thesis
    National Chengchi University
    Executive Master Program of Computer Science
    103971006
    Source URI: http://thesis.lib.nccu.edu.tw/record/#G0103971006
    Data Type: thesis
    Appears in Collections:[Executive Master Program of Computer Science of NCCU] Theses

    Files in This Item:

    File          Size      Format
    100601.pdf    3763Kb    Adobe PDF


    All items in the NCCU Institutional Repository are protected by copyright, with all rights reserved.



    Copyright Announcement
    1. The digital content of this website is part of the National Chengchi University Institutional Repository. It is provided free of charge for non-commercial, public-interest uses such as academic research and public education. Please use it in a proper and reasonable manner and respect the rights of copyright owners. For commercial use, please obtain authorization from the copyright owner in advance.

    2. Every effort has been made in building this website to avoid infringing the rights of copyright owners. If you believe that any material on the website infringes copyright, please contact our staff (nccur@nccu.edu.tw). We will remove the work from the repository and investigate your claim.