Optimization of Vector Function Library Based on Domestic PuDianNao Chip
[1] Yang Zhizheng, Du Zidong, Wen Yuanbo. Optimization of Vector Function Library Based on Domestic PuDianNao Chip[J]. Journal of Zhengzhou University (Engineering Science), 2023, 44(01): 31-37. [doi:10.13705/j.issn.1671-6833.2023.01.013]
References:
[1] Intel Corporation. Intel® oneAPI math kernel library[EB/OL]. (2019-11-01) [2022-02-10]. https://software.intel.com/en-us/mkl.
[2] Intel Corporation. Intel short vector math library[EB/OL]. (2021-06-20) [2022-02-10]. https://software.intel.com/en-us/node/523613.
[3] Advanced Micro Devices, Inc. AMD core math library[EB/OL]. (2013-07-24) [2022-02-10]. http://developer.amd.com/tools-and-sdks/archive/acml-productfeatures/.
[4] ANAND C K, KAHL W. An optimized cell BE special function library generated by coconut[J]. IEEE transactions on computers, 2009, 58(8): 1126-1138.
[5] LAUTER C. A new open-source SIMD vector libm fully implemented with high-level scalar C[C]//2016 50th Asilomar Conference on Signals, Systems and Computers. Piscataway: IEEE, 2016: 407-411.
[6] PIPARO D, INNOCENTE V, HAUTH T. Speeding up HEP experiment software with a library of fast and autovectorisable mathematical functions[J]. Journal of physics: conference series, 2014, 513(5): 052027.
[7] LIU D, GUO S Z, HAO J W, et al. Implementation of transcendental functions on vectors based on SIMD extensions[J]. Computer science, 2021, 48(6): 26-33. (in Chinese)
[8] LIU D F, CHEN T S, LIU S L, et al. PuDianNao[J]. ACM SIGPLAN notices, 2015, 50(4): 369-381.
[9] HUCK J, MORRIS D, ROSS J, et al. Introducing the IA64 architecture[J]. IEEE micro, 2000, 20(5): 12-23.
[10] ZHANG Y, HU Y, LI B, et al. Performance and power analysis of ATI GPU: a statistical approach[C]//2011 IEEE Sixth International Conference on Networking, Architecture, and Storage. Piscataway: IEEE, 2011: 149-158.
[11] KUMURA T, IKEKAWA M, YOSHIDA M, et al. VLIW DSP for mobile applications[J]. IEEE signal processing magazine, 2002, 19(4): 10-21.
[12] KYUNG G, JUNG C M, LEE K. An implementation of a SIMT architecture-based stream processor[C]//TENCON 2014-2014 IEEE Region 10 Conference. Piscataway: IEEE, 2014: 1-5.
[13] XIONG Y Q. A unified programming model for heterogeneous computing with CPU and accelerator technologies[C]//12th International Congress on Image and Signal Processing, BioMedical Engineering and Informatics (CISP-BMEI). Piscataway: IEEE, 2019: 1-4.
[14] MULLER J M. On the definition of ulp(x)[R/OL]. (2005-02-01) [2022-02-10]. https://www.researchgate.net/publication/236944278_On_the_definition_of_ulpx.
[15] Free Software Foundation. The GNU C Library (glibc)[EB/OL]. (2019-11-01) [2022-02-10]. https://www.gnu.org/software/libc/.
[16] NVIDIA. CUDA Math API[EB/OL]. (2022-01-12) [2022-02-10]. https://docs.nvidia.com/cuda/cudamath-api/index.html.
Last Update: 2022-12-07
Copyright © 2023 Editorial Board of Journal of Zhengzhou University (Engineering Science)