Reference: | [1] D. BAHDANAU, K. H. CHO, AND Y. BENGIO, Neural machine translation by jointly learning to align and translate, in 3rd International Conference on Learning Representations, ICLR 2015, 2015. [2] S. BELCIUG AND F. GORUNESCU, Learning a single-hidden layer feed forward neural network using a rank correlation-based strategy with application to high dimensional gene expression and proteomic spectra datasets in cancer detection, Journal of biomedical informatics, 83 (2018), pp. 159–166. [3] Y. BENGIO, Learning deep architectures for AI, Now Publishers Inc, 2009. [4] J.-D. BOISSONNAT AND A. GHOSH, Manifold reconstruction using tangential delaunay complexes, Discrete & Computational Geometry, 51 (2014), pp. 221–267. [5] J.-D. BOISSONNAT, L. J. GUIBAS, AND S. Y. OUDOT, Manifold reconstruction in arbitrary dimensions using witness complexes, Discrete & Computational Geometry, 42 (2009), pp. 37–70. [6] C. BREGLER AND S. M. OMOHUNDRO, Nonlinear manifold learning for visual speech recognition, in Proceedings of IEEE International Conference on Computer Vision, IEEE, 1995, pp. 494–499. [7] H. BREZIS, Functional analysis, Sobolev spaces and partial differential equations, Springer Science & Business Media, 2010. [8] T. B. BROWN, B. MANN, N. RYDER, M. SUBBIAH, J. KAPLAN, P. DHARIWAL, A. NEELAKANTAN, P. SHYAM, G. SASTRY, A. ASKELL, ET AL., Language models are few-shot learners, arXiv preprint arXiv:2005.14165, (2020). [9] D. BURAGO, I. D. BURAGO, Y. BURAGO, S. IVANOV, AND S. A. IVANOV, A course in metric geometry, vol. 33, American Mathematical Soc., 2001. [10] J. CAO AND Z. LIN, Extreme learning machines on high dimensional and large data applications: a survey, Mathematical Problems in Engineering, (2015). [11] I. CARAGIANNIS, G. A. KRIMPAS, AND A. A. VOUDOURIS, Aggregating partial rankings with applications to peer grading in massive online open courses, in Proceedings of the 2015 International Conference on Autonomous Agents and Multiagent Systems, 2015, pp. 675–683. [12] M. P. D. CARMO, Riemannian geometry, Birkhäuser, 1992. [13] T. Q. CHEN, Y. RUBANOVA, J. BETTENCOURT, AND D. K. DUVENAUD, Neural ordinary differential equations, in Advances in Neural Information Processing Systems, 2018, pp. 6572–6583. [14] S.-W. CHENG, T. K. DEY, AND E. A. RAMOS, Manifold reconstruction from point samples., in SODA, vol. 5, 2005, pp. 1018–1027. [15] K. CHO, B. VAN MERRIENBOER, Ç. GULYEHRE, D. BAHDANAU, F. BOUGARES, H. SCHWENK, AND Y. BENGIO, Learning phrase representations using rnn encoder-decoder for statistical machine translation, in EMNLP, 2014. [16] F. R. CHUNG AND F. C. GRAHAM, Spectral graph theory, no. 92, American Mathematical Soc., 1997. [17] J. C. COX, J. E. INGERSOLL JR, AND S. A. ROSS, A theory of the term structure of interest rates, in Theory of valuation, World Scientific, 2005, pp. 129–164. [18] M. A. COX AND T. F. COX, Multidimensional scaling, in Handbook of data visualization, Springer, 2008, pp. 315–347. [19] G. CRAWFORD, The geometric mean procedure for estimating the scale of a judgement matrix, Mathematical Modelling, 9 (1987), pp. 327–334. [20] G. CYBENKO, Approximation by superpositions of a sigmoidal function, Mathematics of control, signals and systems, 2 (1989), pp. 303–314. [21] L. DENG, G. HINTON, AND B. KINGSBURY, New types of deep neural network learning for speech recognition and related applications: An overview, in 2013 IEEE international conference on acoustics, speech and signal processing, IEEE, 2013, pp. 8599–8603. [22] J. DETEMPLE AND Y. KITAPBAYEV, On American VIX options under the generalized 3/2 and 1/2 models, Mathematical Finance, 28 (2018), pp. 550–581. [23] J. DETEMPLE AND C. OSAKWE, The valuation of volatility options, Review of Finance, 4 (2000), pp. 21–50. [24] F. DEUTSCH, The angle between subspaces of a Hilbert space, in Approximation theory, wavelets and applications, Springer, 1995, pp. 107–130. [25] J. DEVLIN, M.-W. CHANG, K. LEE, AND K. TOUTANOVA, BERT: Pre-training of deep bidirectional transformers for language understanding, in Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers), 2019, pp. 4171–4186. [26] M. DRAZIN, Pseudo-inverses in associative rings and semigroups, The American mathematical monthly, 65 (1958), pp. 506–514. [27] B. ERIKSSON, Learning to top-k search using pairwise comparisons, in Artificial Intelligence and Statistics, PMLR, 2013, pp. 265–273. [28] C. FARABET, C. COUPRIE, L. NAJMAN, AND Y. LECUN, Learning hierarchical features for scene labeling, IEEE transactions on pattern analysis and machine intelligence, 35 (2012), pp. 1915–1929. [29] C. FEFFERMAN, S. IVANOV, Y. KURYLEV, M. LASSAS, AND H. NARAYANAN, Fitting a putative manifold to noisy data, in Proceedings of COLT 2018 (Conference On Learning Theory), 2018. [30] -, Reconstruction and interpolation of manifolds. I: The geometric Whitney problem, Foundations of Computational Mathematics, (2019), pp. 1–99. [31] C. FEFFERMAN, S. IVANOV, M. LASSAS, AND H. NARAYANAN, Reconstruction of a Riemannian manifold from noisy intrinsic distances, SIAM Journal on Mathematics of Data Science, 2 (2020), pp. 770–808. [32] W. FELLER, Two singular diffusion problems, Annals of mathematics, (1951), pp. 173–182. [33] R. A. FISHER, The use of multiple measurements in taxonomic problems, Annals of eugenics, 7 (1936), pp. 179–188. [34] G. B. FOLLAND, Real analysis: modern techniques and their applications, vol. 40, John Wiley & Sons, 1999. [35] S. FREEMAN AND J. W. PARKS, How accurate is peer grading?, CBE-Life Sciences Education, 9 (2010), pp. 482–488. [36] F. A. GERS, J. SCHMIDHUBER, AND F. CUMMINS, Learning to forget: Continual prediction with LSTM, (1999). [37] J. GOARD AND M. MAZUR, Stochastic volatility models and the pricing of VIX options, Mathematical Finance: An International Journal of Mathematics, Statistics and Financial Economics, 23 (2013), pp. 439–458. [38] I. GOODFELLOW, Y. BENGIO, A. COURVILLE, AND Y. BENGIO, Deep learning, vol. 1, MIT press Cambridge, 2016. [39] A. GRAVES, A.-R. MOHAMED, AND G. HINTON, Speech recognition with deep recurrent neural networks, in 2013 IEEE international conference on acoustics, speech and signal processing, Ieee, 2013, pp. 6645–6649. [40] A. GRUNBICHLER AND F. A. LONGSTAFF, Valuing futures and options on volatility, Journal of Banking & Finance, 20 (1996), pp. 985–1001. [41] E. HABER AND L. RUTHOTTO, Stable architectures for deep neural networks, Inverse Problems, 34 (2017), p. 014004. [42] B. HANIN AND M. SELLKE, Approximating continuous functions by relu nets of minimal width, arXiv preprint arXiv:1710.11278, (2017). [43] K. HE, X. ZHANG, S. REN, AND J. SUN, Deep residual learning for image recognition, in Proceedings of the IEEE conference on computer vision and pattern recognition, 2016, pp. 770–778. [44] M. HERBSTER, M. PONTIL, AND L. WAINER, Online learning over graphs, in Proceedings of the 22nd international conference on Machine learning, 2005, pp. 305–312. [45] S. HOCHREITER AND J. SCHMIDHUBER, Long short-term memory, Neural computation, 9 (1997), pp. 1735–1780. [46] L. HOGBEN, Spectral graph theory and the inverse eigenvalue problem of a graph, The Electronic Journal of Linear Algebra, 14 (2005). [47] M. HOHENWARTER, Geogebra–ein softwaresystem für dynamische geometrie und algebra der ebene [geogebra–a software system for dynamic geometry and algebra in the plane] (unpublished master’s thesis), Universität Salzburg, Salzburg, Austria, (2002). [48] J. HOWARD AND S. RUDER, Universal language model fine-tuning for text classification, in Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2018, pp. 328–339. [49] G.-B. HUANG, Q.-Y. ZHU, AND C.-K. SIEW, Extreme learning machine: a new learning scheme of feedforward neural networks, in 2004 IEEE international joint conference on neural networks (IEEE Cat. No. 04CH37541), vol. 2, Ieee, 2004, pp. 985–990. [50] X. HUO, X. S. NI, AND A. K. SMITH, A survey of manifold-based learning methods, Recent advances in data mining of enterprise data, (2007), pp. 691–745. [51] X. JIANG, L.-H. LIM, Y. YAO, AND Y. YE, Statistical ranking and combinatorial Hodge theory, Mathematical Programming, 127 (2011), pp. 203–244. [52] D. P. KINGMA AND J. BA, Adam: A method for stochastic optimization, in ICLR (Poster), 2015. [53] D. B. KOTLOW, A free boundary problem connected with the optimal stopping problem for diffusion processes, Transactions of the American Mathematical Society, 184 (1973), pp. 457–478. [54] A. KRIZHEVSKY, I. SUTSKEVER, AND G. E. HINTON, ImageNet classification with deep convolutional neural networks, in Advances in neural information processing systems, 2012, pp. 1097–1105. [55] Y. LECUN, Y. BENGIO, AND G. HINTON, Deep learning, nature, 521 (2015), pp. 436–444. [56] Y. LECUN, L. BOTTOU, Y. BENGIO, AND P. HAFFNER, Gradient-based learning applied to document recognition, Proceedings of the IEEE, 86 (1998), pp. 2278–2324. [57] D. LEVIN, The approximation power of moving least-squares, Mathematics of computation, 67 (1998), pp. 1517–1531. [58] H. LIN AND S. JEGELKA, Resnet with one-neuron hidden layers is a universal approximator, Advances in neural information processing systems, 31 (2018), pp. 6169–6178. [59] T. LIN AND H. ZHA, Riemannian manifold learning, IEEE Transactions on Pattern Analysis and Machine Intelligence, 30 (2008), pp. 796–809. [60] T.-Y. LIN AND Y.-L. TSAI, An application of hodgerank to online peer assessment, in Proceedings of the 2017 SIAM Workshop on Machine Learning for Recommender Systems (MLRec), 2018. [61] Y. LIPMAN, D. COHEN-OR, D. LEVIN, AND H. TAL-EZER, Parameterization-free projection for geometry reconstruction, ACM Transactions on Graphics (TOG), 26 (2007), pp. 22–es. [62] H.-K. LIU, Properties of American volatility options in the mean-reverting 3/2 volatility model, SIAM Journal on Financial Mathematics, 6 (2015), pp. 53–65. [63] H.-K. LIU, T.-Y. LIN, AND Y.-L. TSAI, On the pricing formula for the perpetual American volatility option under the mean-reverting processes, Taiwanese Journal of Mathematics, (2020). [64] T.-Y. LIU, Learning to rank for information retrieval, (2011). [65] Y. LU, A. ZHONG, Q. LI, AND B. DONG, Beyond finite layer neural networks: Bridging deep architectures and numerical differential equations, in International Conference on Machine Learning, PMLR, 2018, pp. 3276–3285. [66] Z. LU, H. PU, F. WANG, Z. HU, AND L. WANG, The expressive power of neural networks: A view from the width, Advances in neural information processing systems, 30 (2017), pp. 6231–6239. [67] A. L. MAAS, A. Y. HANNUN, AND A. Y. NG, Rectifier nonlinearities improve neural network acoustic models, in ICML Workshop on Deep Learning for Audio, Speech and Language Processing, 2013. [68] G. A. MILLER, The magical number seven, plus or minus two: Some limits on our capacity for processing information., Psychological review, 63 (1956), p. 81. [69] J. W. Milnor and j. D. Stasheff, Characteristic Classes, no. 76, Princeton University Press, 1974. [70] V. MNIH, K. KAVUKCUOGLU, D. SILVER, A. GRAVES, I. ANTONOGLOU, D. WIERSTRA, AND M. RIEDMILLER, Playing Atari with deep reinforcement learning, (2013). cite arxiv:1312.5602Comment: NIPS Deep Learning Workshop 2013. [71] Y. NESTEROV, A method for unconstrained convex minimization problem with the rate of convergence o (1/kˆ 2), in Doklady an ussr, vol. 269, 1983, pp. 543–547. [72] B. NEYSHABUR, S. BHOJANAPALLI, D. MCALLESTER, AND N. SREBRO, Exploring generalization in deep learning, in Proceedings of the 31st International Conference on Neural Information Processing Systems, 2017, pp. 5949–5958. [73] P. NIYOGI, S. SMALE, AND S. WEINBERGER, Finding the homology of submanifolds with high confidence from random samples, Discrete & Computational Geometry, 39 (2008), pp. 419–441. [74] Y. PARMAR, S. NATARAJAN, AND G. SOBHA, Deeprange: deep-learning-based object detection and ranging in autonomous driving, IET Intelligent Transport Systems, 13 (2019), pp. 1256–1264. [75] M. PETERS, M. NEUMANN, M. IYYER, M. GARDNER, C. CLARK, K. LEE, AND L. ZETTLEMOYER, Deep contextualized word representations, in Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers), 2018, pp. 2227–2237. [76] G. PHILIPP, D. SONG, AND J. G. CARBONELL, Gradients explode-deep networks are shallow-resnet explained, (2018). [77] R. PIZIAK AND P. ODELL, Affine projections, Computers & Mathematics with Applications, 48 (2004), pp. 177–190. [78] R. PLESS AND R. SOUVENIR, A survey of manifold learning for images, IPSJ Transactions on Computer Vision and Applications, 1 (2009), pp. 83–94. [79] L. RUTHOTTO AND E. HABER, Deep neural networks motivated by partial differential equations, Journal of Mathematical Imaging and Vision, (2019), pp. 1–13. [80] R. W. SAATY, The analytic hierarchy process-what it is and how it is used, Mathematical modelling, 9 (1987), pp. 161–176. [81] M. SAERENS, F. FOUSS, L. YEN, AND P. DUPONT, The principal components analysis of a graph, and its relationships to spectral clustering, in European conference on machine learning, Springer, 2004, pp. 371–383. [82] F. SCARSELLI, M. GORI, A. C. TSOI, M. HAGENBUCHNER, AND G. MONFARDINI, The graph neural network model, IEEE Transactions on Neural Networks, 20 (2008), pp. 61–80. [83] R. K. SIZEMORE, HodgeRank: Applying combinatorial Hodge theory to sports ranking, PhD thesis, Wake Forest University, 2013. [84] B. SOBER AND D. LEVIN, Manifold approximation by moving least-squares projection (MMLS), Constructive Approximation, (2019), pp. 1–46. [85] G. STEWART, On the continuity of the generalized inverse, SIAM Journal on Applied Mathematics, 17 (1969), pp. 33–45. [86] Y. TAY, M. DEHGHANI, D. BAHRI, AND D. METZLER, Efficient transformers: A survey, arXiv preprint arXiv:2009.06732, (2020). [87] N. M. TRAN, Pairwise ranking: choice of method can produce arbitrarily different rank order, Linear Algebra and its Applications, 438 (2013), pp. 1012–1024. [88] G. VAN ROSSUM AND F. L. DRAKE JR, Python tutorial, vol. 620, Centrum voor Wiskunde en Informatica Amsterdam, 1995. [89] A. VASWANI, N. SHAZEER, N. PARMAR, J. USZKOREIT, L. JONES, A. N. GOMEZ, L. KAISER, AND I. POLOSUKHIN, Attention is all you need, Advances in neural information processing systems, 30 (2017), pp. 5998–6008. [90] A. VEIT, M. J. WILBER, AND S. BELONGIE, Residual networks behave like ensembles of relatively shallow networks, Advances in neural information processing systems, 29 (2016), pp. 550–558. [91] P. VELICKOVIC, G. CUCURULL, A. CASANOVA, A. ROMERO, P. LIO, AND Y. BENGIO, Graph attention networks, in International Conference on Learning Representations, 2018. [92] V. P. VISHWAKARMA AND M. GUPTA, A new learning algorithm for single hidden layer feedforward neural networks, International Journal of Computer Applications, 28 (2011), pp. 26–33. [93] T. WALSH, The PeerRank method for peer assessment, in Proceedings of the Twenty-first European Conference on Artificial Intelligence, 2014, pp. 909–914. [94] T.-W. WENG, H. ZHANG, P.-Y. CHEN, J. YI, D. SU, Y. GAO, C.-J. HSIEH, AND L. DANIEL, valuating the robustness of neural networks: An extreme value theory approach, in International Conference on Learning Representations, 2018. [95] D. B. WEST, Introduction to graph theory, vol. 2, Prentice hall Upper Saddle River, 2001. [96] H. WHITNEY, Differentiable manifolds, Annals of Mathematics, (1936), pp. 645–680. [97] S. WOLD, K. ESBENSEN, AND P. GELADI, Principal component analysis, Chemometrics and intelligent laboratory systems, 2 (1987), pp. 37–52. [98] Z. WU, S. PAN, F. CHEN, G. LONG, C. ZHANG, AND S. Y. PHILIP, A comprehensive survey on graph neural networks, IEEE Transactions on Neural Networks and Learning Systems, (2020). [99] S. XIE, R. GIRSHICK, P. DOLLAR, Z. TU, AND K. HE, Aggregated residual transformations for deep neural networks, in Proceedings of the IEEE conference on computer vision and pattern recognition, 2017, pp. 1492–1500. [100] K. XU, W. HU, J. LESKOVEC, AND S. JEGELKA, How powerful are graph neural networks?, in International Conference on Learning Representations, 2018. |