謝謝 cykerr 大大這麼有用的材料。
今天無意間發現了一篇論文,為台灣清華大學碩士論文,但該論文未公開,所以無法取得全文。
我會想辦法去取得這份論文,並提供給大家參考。
※台灣大學、台灣交通大學、台灣清華大學、台灣成功大學,在工程領域上為台灣 Top 5 的大學。
=====
http://140.113.39.130/cgi-bin/gs/hugsweb.cgi?o=dnthucdr&i=sGH000936350.id
題名: 基於繪圖處理器之封包檢測技術研究
其他題名: A Graphics Processor-based Packet Inspection Scheme
作者: 徐獻文
Hsien-Wen Hsu
描述: 碩士
國立清華大學
資訊工程學系
GH000936350
日期: 2006
關鍵詞: 顯示卡
字串比對
網路安全
GPU
String matching
Network security
摘要: 近年來網路安全備受重視,深層封包檢測等相關技術被廣泛地研究與應用。本論文提出了一個基於繪圖處理器之深層封包檢測方案與架構 -- 利用個人電腦上之繪圖卡來作為字串比對之加速器。由於繪圖卡天生具有平行運算能力,我們利用繪圖卡中數量極多且可平行運算的Fragment Shader做為平行字串比對處理器,搭配繪圖卡的平行管線,將字串比對的效能發揮到極限。這對需求度日益增高的深層封包檢測是十分具有吸引力的,因為大部分的網路流量都是由多個連線所聚集而來的,所以本論文提出的方案與架構特別針對如何平行地處理這些網路流量有一巧妙之設計與安排。我們將熟知的自動機結構,轉化為可由繪圖卡所執行之資料結構,並透過了各種圖形處理程式語言來處理其有關狀態轉換過程的 data flow以及與系統I/O運作的control flow。因此本論文提出的方案與架構將會適用於許多以自動機為基礎的字串比對演算法,例如著名的Aho and Corasick(AC) 演算法。本論文除理論上提出了架構,並將之實現於繪圖卡之中。並透過實驗計算來分析效能表現。包括展示將自動機轉換為材質後,其在繪圖卡記憶體中的佈局,並計算出各種資料結構所需的記憶體大小外,更客觀地去分析其處理效能表現。經由實驗證明,本論文提出的方案與其他同樣利用硬體加速器的方案相比較下,我們可將目前市面上售價不到美金400元的繪圖卡,擁有高達6.4G bps的字串比對能力。另外,透過實驗本論文證明可讓一般個人電腦中大部分時間為閒置的繪圖卡充分地發揮其效用,在使用者無須再添購任何的新硬體的情況下,與其他的純軟體字串比對實作相比較,約可以減少2/3以上字串比對時間。我們相信此一研究將會成為利用繪圖卡進行網路封包處理的濫觴。
Network security has recently become increasingly important. Hence, related technologies like deep packet inspection are being extensively researched and widely implemented. This study proposes a novel scheme and architecture for packet content inspection using graphics processing units (GPUs). The proposed method takes the common component of personal computers, namely GPU, as the accelerator for pattern matching which the critical problem in deep packet inspection. With the native parallel computing power of GPUs, the multiple fragment shaders are considered as parallel pattern matching engines, and with simultaneous pipelines they maximize power of pattern matching. These features are very attractive for content signature recognition which is becoming increasingly popular. Since most network traffic is aggregated from many sessions, the proposed scheme and architecture are particularly designed for processing the multi-session network traffic simultaneously. The well-known automata structure is converted to the data structure executed in GPUs for processing the data flow, which manages the state transition procedure, and the control flow, which is in charge of system I/O, via various graphics processing languages. The proposed scheme and architecture work well with many automaton-based pattern matching algorithms. This study focuses on the most famous one, the Aho and Corasick (AC) algorithm. Besides presenting the scheme in theory and implementing on commodity GPU in practice, this study analyzes the performance of our proposed approach through evaluations, such as showing the memory fingerprint of the automaton in GPU, determining the memory size of all necessary data structures, and analyzing the throughput. The experiment results indicate that, the proposed pattern matching approach can achieve 6.4Gbps throughput powered by the graphics card with the market price less than US$400, representing an improvement on other approaches using accelerator. This study also reveals that the proposed scheme can exploit the resources of the GPU to accelerate the pattern matching processing time, since the GPU installed in common PCs is originally idle at most time periods. The processing time of the proposed software-based implementation is 2/3 less than that of other software-based pattern matching implementations. The proposed concept, Network Processing on Graphics Processing Units, is a novel research field in network processing.
參考資料: [1] R. S. Boyer and J. S. Moore, “A fast string searching algorithm,” Communications of the ACM, vol. 20, Session 10, Oct. 1977, pp. 761–772.
[2] A. V. Aho and M. J. Corasick, “Efficient string matching: An aid to bibliographic search,” Communications of the ACM, vol. 18, issue 6, Jun. 1975, pp. 333–340.
[3] S. Wu and U. Manber, “A fast algorithm for multi-pattern searching,” Technical Report TR-94-17, Department of Computer Science, University of Arizona, 1994.
[4] N. Tuck, T. Sherwood, B. Calder, G. Varghese, “Deterministic memory-efficient string matching algorithms for intrusion detection,” In Proceedings of the IEEE Infocom Conference, 2004, pp. 333–340.
[5] S. Dharmapurikar, P. Krishnamurthy, T. Sproull, J. Lockwood, “Deep packet inspection using parallel Bloom filters,” IEEE Micro, vol. 24, No. 1, 2004, pp. 52–61.
[6] J. Moscola, J. Lockwood, R. P. Loui, and M. Pachos, “Implementation of a content-scanning module for an internet firewall,” Proceedings of IEEE Symposium on Field-Programmable Custom Computing Machines (FCCM), Napa, CA, April 9–11, 2003, pp. 31–38.
[7] F. Yu , R. H. Katz , T. V. Lakshman, “Gigabit rate packet pattern-matching using TCAM,” Proceedings of the network protocols, 12th IEEE International Conference on (ICNP’04), Oct. 5–8, 2004, pp.174–183.
[8] C. Courcoubetis and V. A. Siris, “Measurement and analysis of real network traffic,” Proceedings of the 7th Hellenic Conference on Informatics (HCI'99), Aug. 1999
[9] R.T. Liu, N.F. Huang, C.H. Chen, C.N. Kao,“A Fast String Matching Algorithm for Network Processor-based Intrusion Detection Systems," ACM Transactions on Embedded Computer Systems, Vol. 3, No. 3, Aug. 2004, pp. 614 – 633.
[10] M. Pharr and R. Fernando, “GPU Gems 2,” Addison Wesley, 2004.
[11] GPGPU: General-Purpose Computation on GPUs. http://www.gpgpu.org
[12] P. Trancoso, and M. Charalambous, “Exploring graphics processor performance for general purpose applications,” dsd, 8th Euromicro Conference on Digital System Design (DSD’05), 2005, pp. 306–313.
[13] N. K. Govindaraju, J. Gray, R. Kumar, and D. Manocha, “GPUTeraSort: High performance graphics co-processor sorting for large database management,” Proceedings of ACM SIGMOD Conference, Chicago, IL, Jun. 2006.
[14] P. Kipfer, M. Segal, and R. Westermann, “UberFlow: A GPU-based particle engine,” Proceedings of the ACM SIGGRAPH/EUROGRAPHICS conference on Graphics hardware, session: Computation, 2004, pp. 115–122.
[15] I. Rudomín, B. Hernandez, E. Millán, “Fragment shaders for agent animation using finite state machines,” In Simulation Modelling Practice and Theory Journal, Volume 13, Issue 8, Programmable Graphics Hardware November 2005, pp. 741–751 Elsevier, (preprint)
[16] D. L. Cook, J. Ioannidis, A. D. Keromytis, and J. Luck, “CryptoGraphics: Secret key cryptography using graphics cards,” Proceedings of the RSA Conference, Cryptographer's Track (CT-RSA), 2005, pp. 334–350.
[17] N. Galoppo, N. K. Govindaraju, M. Henson, and D. Manocha, “LU-GPU: Efficient algorithms for solving dense linear systems on graphics hardware,” sc, ACM/IEEE SC 2005 Conference (SC’05), 2005, pp. 3.
[18] J. D. Hall, N. A. Carr, and J. C. Hart, “Cache and bandwidth aware matrix multiplication on the GPU,” Technical Report UIUCDCS-R-2003-2328, University of Illinois, Apr. 2003.
[19] U. Kapasi, W. J. Dally, et al. “The Imagine stream processor,” In IEEE International Conference on Computer Design, Sep. 2002, pp. 282–288.
[20] DEFCON. http://cctf.shmoo.com
[21] M. Roesch, “Snort: Lightweight intrusion detection for networks,” Proceedings of the 1999 USENIX LISA Systems Administration Conference, November 1999. http://www.snort.org/
[22] Randi J. Rost, “OpenGL(R) shading language,” Addison Wesley Longman Publishing Co., Inc., Redwood City, CA, 2004
[23] I. Buck, T. Foley, D. Horn, J. Sugerman, K. Fatahalian, M. Houston, and P. Hanrahan, “Brook for GPUs: Stream computing on graphics hardware,” ACM Transactions on Graphics (SIGGRAPH) ,Aug. 2004, pp. 777–786
[24] R.S. Wright, M. Sweet, “OpenGL SuperBible,” Waite Group Press, Indianapolis, 2000.
[25] “NVIDIA developer tools: NVShaderPerf,” http://developer.NVIDIANVIDIANVIDIA.com/object/nvshaderperf_home.html
[26] Y. H. Cho, S. Navab, and W. Mangione-Smith, “Specialized hardware for deep network packet filtering,” Proceedings of 12th International Conference on Field, vol. 2438, Sep. 2–4, 2002, pp. 452.
[27] T. Song, W. Zhang, Z. Tang, D. Wang, “Alphabet based selected character decoding for area efficient pattern matching architecture on FPGAs,” icess, Second International Conference on Embedded Software and Systems (ICESS’05), 2005, pp. 276–283.
[28] H. Bos, K. Huang, “Towards software-based signature detection for intrusion prevention on the network card,” Proceedings of Eighth International Symposium on Recent Advances in Intrusion Detection (RAID2005), Seattle, Washington, Sep. 2005.
[29] M. Norton, “Optimizing Pattern Matching for Intrusion Detection,” Jul. 2004. http://docs.idsresearch.org/OptimizingPatternMatchingForIDS.pdf.
[30] “Graphic Remedy gDEBugger,” http://www.gremedy.com, 2005
[31] L. Tan, B. Brotherton, and T. Sherwood, “Bit-split string-matching engines for intrusion detection and prevention,” ACM Transactions on Architecture and Code Optimization (TACO), Vol. 3 No. 1, Jun. 2006.
永久連結: http://nthur.lib.nthu.edu.tw/handle/987654321/35871
來源連結: http://thesis.nthu.edu.tw/cgi-bin/gs/hugsweb.cgi?o=dnthucdr&i=sGH000936350.id
顯示於類別: [資訊工程學系所] 博碩士論文