Reference: | [1] Abid, A., Abdalla, A., Abid, A., Khan, D., Alfozan, A., Zou, J.: Gradio: Hasslefree sharing and testing of ml models in the wild. arXiv preprint arXiv:1906.02569(2019) [2] Aiello, M., Pegoretti, A.: Textual article clustering in newspaper pages. Applied Artificial Intelligence 20(9), 767–796 (2006). https://doi.org/10.1080/08839510600903858 [3] Clausner, C., Pletschacher, S., Antonacopoulos, A.: The significance of reading order in document recognition and its evaluation. 2013 12th International Conference on Document Analysis and Recognition 688–692 (2013) [4] Dosovitskiy, A., Beyer, L., Kolesnikov, A., Weissenborn, D., Zhai, X., Unterthiner, T., Dehghani, M., Minderer, M., Heigold, G., Gelly, S., Uszkoreit, J., Houlsby, N.: An image is worth 16x16 words: Transformers for image recognition at scale (2021) [5] Du, Y., Chen, Z., Jia, C., Yin, X., Zheng, T., Li, C., Du, Y., Jiang, Y.G.: Svtr: Scene text recognition with a single visual model (2022) [6] Egly, R., Driver, J., Rafal, R.: Shifting visual attention between objects and locations: evidence from normal and parietal lesion subjects. Journal of Experimental Psychology: General 123(2), 161–177 (jun 1994). https://doi.org/10.1037//0096- 3445.123.2.161 [7] Ferilli, S., Grieco, D., Redavid, D., Esposito, F.: Abstract argumentation for reading order detection. In: ACM Symposium on Document Engineering (2014) [8] Gu, Z., Meng, C., Wang, K., Lan, J., Wang, W., Gu, M., Zhang, L.: Xylayoutlm: Towards layout-aware multimodal networks for visually-rich document understanding (2022). https://doi.org/10.48550/ARXIV.2203.06947 [9] Ha, J., Haralick, R., Phillips, I.: Recursive x-y cut using bounding boxes of connected components. In: Proceedings of 3rd International Conference on Document Analysis and Recognition. vol. 2, 952–955 vol.2 (1995). https://doi.org/10.1109/ICDAR.1995.602059 [10] Howard, A., Sandler, M., Chen, B., Wang, W., Chen, L., Tan, M., Chu, G., Va- sudevan, V., Zhu, Y., Pang, R., Adam, H., Le, Q.: Searching for mobilenetv3. In: 2019 IEEE/CVF International Conference on Computer Vision (ICCV). 1314–1324. IEEE Computer Society, Los Alamitos, CA, USA (nov 2019). https://doi.org/10.1109/ICCV.2019.00140 [11] Iani, C., Nicoletti, R., Rubichi, S., Umiltà, C.: Shifting attention between objects. Cognitive Brain Research 11(1), 157–164 (2001). https://doi.org/10.1016/S0926-6410(00)00076-8 [12] KENDALL, M.G.: A NEW MEASURE OF RANK CORRELATION. Biometrika 30(1-2), 81–93 (06 1938). https://doi.org/10.1093/biomet/30.1-2.81 [13] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization (2014). https://doi.org/10.48550/ARXIV.1412.6980 [14] Kosinski, M.: Theory of mind may have spontaneously emerged in large language models (2023) [15] Kumar, R., Vassilvitskii, S.: Generalized distances between rankings. In: Proceedings of the 19th International Conference on World Wide Web. 571 –40 580. WWW ’10, Association for Computing Machinery, New York, NY, USA (2010). https://doi.org/10.1145/1772690.1772749 [16] Lamy, D., Egeth, H.: Object-based selection: The role of attentional shifts. Perception & Psychophysics 64(1), 52–66 (2002). https://doi.org/10.3758/BF03194557 [17] Li, L., Gao, F., Bu, J., Wang, Y., Yu, Z., Zheng, Q.: An end-to-end ocr text re-organization sequence learning for rich-text detail image comprehension. In: Vedaldi, A., Bischof, H., Brox, T., Frahm, J.M. (eds.) Computer Vision – ECCV 2020. 85–100. Springer International Publishing, Cham (2020) [18] Liao, M., Zou, Z., Wan, Z., Yao, C., Bai, X.: Real-time scene text detection with differentiable binarization and adaptive scale fusion (2022) [19] Liu, Z.Y.: Understanding of Printed Ancient Book and Book Collectors. studentbooktw (2007) [20] Ma, W., Zhang, H., Jin, L., Wu, S., Wang, J., Wang, Y.: Joint layout analysis, character detection and recognition for historical document digitization (2020). https://doi.org/10.48550/ARXIV.2007.06890, https://arxiv.org/abs/2007.06890 [21] Mai, J., Chen, J., Li, B., Qian, G., Elhoseiny, M., Ghanem, B.: Llm as a robotic brain: Unifying egocentric memory and control (2023) [22] Malerba, D., Ceci, M., Berardi, M.: Machine Learning for Reading Order Detection in Document Image Understanding, vol. 90, 45–69 (12 2007). https://doi.org/10.1007/978-3-540-76280-5_3 [23] Mukherjee, K., Khare, A., Verma, A.: A simple dynamic learning rate tuning algorithm for automated training of dnns (2019). https://doi.org/10.48550/ARXIV.1910.11605 [24] Naoum, A., Nothman, J., Curran, J.: Article segmentation in digitised newspapers with a 2d markov model. In: 2019 International Conference on Document Analysis and Recognition (ICDAR). 1007–1014 (2019). https://doi.org/10.1109/ICDAR.2019.00165 [25] Neisser, U.: Cognitive Psychology. Appleton-Century-Crofts, New York (1967) [26] Park, J.S., O’Brien, J.C., Cai, C.J., Morris, M.R., Liang, P., Bernstein, M.S.: Generative agents: Interactive simulacra of human behavior (2023) [27] Posner, M.: Orienting of attention. The Quarterly journal of experimental psychology 32, 3–25 (03 1980). https://doi.org/10.1080/00335558008248231 [28] Quiros, L., Vidal, E.: Learning to sort handwritten text lines in reading order through estimated binary order relations. In: 2020 25th Inter- national Conference on Pattern Recognition (ICPR). 7661–7668 (2021). https://doi.org/10.1109/ICPR48806.2021.9413256 [29] Quirós, L., Vidal, E.: Reading order detection on handwritten documents. Neural Computation and Applications 34, 9593–9611 (2022). https://doi.org/10.1007/s00521-022-06948-5 [30] Villalobos, P., Sevilla, J., Heim, L., Besiroglu, T., Hobbhahn, M., Ho, A.: Will we run out of data? an analysis of the limits of scaling datasets in machine learning(2022) [31] Walczyk, J.J.: The interplay between automatic and control processes in reading. Reading Research Quarterly 35(4), 554–566 (2000), http://www.jstor.org/stable/748099 [32] Wang, Z., Xu, Y., Cui, L., Shang, J., Wei, F.: LayoutReader: Pre-training of text and layout for reading order detection. In: Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing. 4735–4744. Association for Computational Linguistics, Online and Punta Cana, Dominican Republic (Nov 2021). https://doi.org/10.18653/v1/2021.emnlp-main.389, https:// aclanthology.org/2021.emnlp-main.389 [33] Wei, L.: Simple Organization and Version Study of Ancient Books. Macao Library & Information Management Association, Macao (2004) [34] Xu, Y., Li, M., Cui, L., Huang, S., Wei, F., Zhou, M.: LayoutLM: Pre-training of text and layout for document image understanding. In: Proceedings of the 26th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining. ACM (aug 2020). https://doi.org/10.1145/3394486.3403172, https://doi.org/ 10.1145%2F3394486.3403172 [35] Yang, H., Jin, L., Huang, W., Yang, Z., Lai, S., Sun, J.: Dense and tight detection of chinese characters in historical documents: Datasets and a recognition guided detector. IEEE Access 6, 30174–30183 (2018). https://doi.org/10.1109/ACCESS.2018.2840218 [36] Yu, H., Chen, J., Li, B., Xue, X.: Chinese character recognition with radicalstructured stroke trees (2022) |