-
Record number
23690
-
Call number
COM3 121
-
Author
لولوييان، پيمان
-
Title
Exploration and Design of an Approximate Compression Structure for Cache Memory in Approximable Applications with a Domain-Specific Approach (Image Data)
-
Degree level
PhD
-
Field of study
Computer Engineering - Computer Systems Architecture
-
Faculty
Computer Engineering
-
Defense date
1403/04/27
-
Pagination
143 p.
-
Supervisor
هومان نيك مهر
-
Advisor
مهران رضائي
-
Persian keywords
Approximate computing , Cache memory , Domain-specific computing , Quality of service , Approximate data compression , Dark silicon
-
Persian abstract
A number of the technologies that drove the exponential growth in the performance of computer systems over past decades have run into physical barriers. On the other hand, many of today's applications, such as signal and image processing, machine learning, data analytics, games, and artificial intelligence applications, do not need exact computing or need only a certain level of computational accuracy. As a result, approximate computing has been considered as a way to mitigate the effects of dark silicon. The domain-specific approach to digital system design can likewise serve as a way to cope with the practical limitations that have arisen in the design of computing systems. Meanwhile, the memory hierarchy, and cache memory in particular, plays a key role in computing systems. Moreover, cache compression can be defined as reducing the size of cache data so that the computing system gains the benefits of increased capacity without suffering the negative physical effects of a larger memory, which can affect cache quality. In view of the above, this research addresses the exploration and design of an approximate compression structure for cache memory in approximable applications with a domain-specific approach (image data). The main challenge is to improve pattern detection while meeting the following requirements: high compression speed, a simple structure, a high compression ratio, and preservation of acceptable quality. The common baseline methods used in compression algorithms are improved using approximation by truncation and implemented as a simple hardware compression algorithm with a high compression ratio and high speed (1 clock cycle for decompression and 2 clock cycles for compression), named Image Approximate Block Compression. The algorithm is examined through analysis, estimation of time and logic complexity, functional simulation, and hardware simulation and synthesis, and is measured against one of the fastest algorithms with multi-compression capability and its approximate version. The evaluation results show an increase in block compression ratio of 25.7 times on average (up to 106 times) relative to the comparison baseline and of 2.69 times on average (up to 45.0 times) relative to its approximate version. These compression benefits come at an average error of only 2.73 percent in the quality of a deep-learning-based image recognition application. For individual images, an average peak signal-to-noise ratio of 39.49 dB was also observed. The reported qualities come with only 13 percent memory overhead. The topics are also examined from the perspective of information theory, and the effect of approximation on the Huffman compression algorithm, which is used in some cache compression algorithms, is studied for the specific processing domain. The analytical and simulation results show that the approximation method used produces smaller Huffman codes (2.18 times), which increases the block compression ratio by up to 2.48 times.
-
English keywords
Approximate Computing , Cache Memory , Domain-specific Computing , Quality of Service , Approximate Data Compression , Dark Silicon
-
English title
Exploring and Design of Approximate Cache Compression Structure for Domain-specific Approximate Applications (Image Data)
-
Department
Computer Architecture Engineering
-
English abstract
Some of the technologies that drove digital system design in the past decades have run into physical limits. Many modern applications, such as image and signal processing, machine learning, data analytics, computer games, and AI, do not really require full-precision computing. Approximate computing is therefore considered a promising way to cope with dark silicon. The emerging domain-specific approach to computer architecture can also help in coping with the practical limits of computing system design. Meanwhile, the memory hierarchy, and cache memory in particular, plays a key role in computing systems. Cache compression can be defined as reducing the size of cached data to gain the benefits of a larger cache without the major drawbacks of a physically larger cache architecture. Building on these points, we address approximate cache compression for a specific domain (image data). The design challenge is to improve pattern detection while achieving high decompression speed, a simple structure, a high compression ratio, and output degradation within an acceptable range. We look at the problem from heuristic, analytic, and physical implementation perspectives through analysis and circuit- and system-level simulations. We improve the mainstream block compression approaches through approximation and introduce the Image Approximate Block Compressor (IABC) algorithm: a simple and fast hardware compression algorithm with a high block compression ratio (one-cycle latency for decompression and two-cycle latency for compression). The proposed algorithm is examined through analysis, theoretical estimates of logic and time complexity, functional simulation, and hardware synthesis and simulation. The results are compared against one of the fastest multi-compressor algorithms and its approximate version as our baselines. The evaluations reveal a compression ratio of 25.7 on average (up to 106), against the baseline with an average ratio of 2.69 (up to 45.0) and the approximate version of the baseline with an average ratio of 2.7 (up to 45.2). The evaluation results also show that the compression benefits of IABC come at only 2.73% average error in the quality of a deep learning object recognition application. In addition, IABC generates high-quality outputs for stand-alone images, with an average Peak Signal-to-Noise Ratio of 39.49 dB. These qualities come at only 13% storage overhead. Finally, we look at the problem from the information theory perspective and show the effect of approximation on Huffman encoding (which is used in some cache compression algorithms) for image data. The analysis and simulations show smaller Huffman code lengths (2.18x), which leads to up to a 2.48x block compression ratio.
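The link between approximation and shorter Huffman codes can be illustrated with a small sketch. It is not the thesis's IABC algorithm: it assumes, purely for illustration, that the approximation zeroes the low-order bits of 8-bit pixels, and it runs on a synthetic 64-pixel block. The names `truncate`, `huffman_avg_length`, `psnr`, and `dropped_bits` are hypothetical, and the numbers it prints are unrelated to the results reported in the abstract.

```python
import heapq
import math
from collections import Counter


def truncate(pixels, dropped_bits):
    """Zero the lowest `dropped_bits` bits of each 8-bit pixel (assumed approximation)."""
    mask = 0xFF & ~((1 << dropped_bits) - 1)
    return [p & mask for p in pixels]


def huffman_avg_length(symbols):
    """Average Huffman code length in bits per symbol.

    The sum of all merged node weights during Huffman tree construction
    equals the total encoded length, so no explicit code table is needed.
    """
    counts = list(Counter(symbols).values())
    if len(counts) == 1:
        return 1.0  # a single distinct symbol still needs one bit
    heapq.heapify(counts)
    total_bits = 0
    while len(counts) > 1:
        merged = heapq.heappop(counts) + heapq.heappop(counts)
        total_bits += merged
        heapq.heappush(counts, merged)
    return total_bits / len(symbols)


def psnr(original, approx, peak=255):
    """Peak signal-to-noise ratio in dB between two equal-length pixel lists."""
    mse = sum((o - a) ** 2 for o, a in zip(original, approx)) / len(original)
    return math.inf if mse == 0 else 10 * math.log10(peak ** 2 / mse)


# Synthetic block: 64 pixels spread evenly over 16 neighbouring grey levels.
block = [100 + (i * 7) % 16 for i in range(64)]
approx_block = truncate(block, dropped_bits=2)

print("bits/pixel, exact:      ", huffman_avg_length(block))
print("bits/pixel, approximate:", huffman_avg_length(approx_block))
print("PSNR (dB):              ", round(psnr(block, approx_block), 2))
```

Truncation merges neighbouring grey levels into one symbol, so the symbol alphabet shrinks and the Huffman codes get shorter, at the cost of a bounded per-pixel error that the PSNR figure reflects.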
-
Number of chapters
6