1. Zhou Y, Liu H, Xiang Q, Yin C. High-Performance and Flexible Design Scheme with ECC Protection in the Cache. Micromachines (Basel) 2022; 13:1931. PMID: 36363952; PMCID: PMC9697281; DOI: 10.3390/mi13111931. Received 09/27/2022; accepted 11/07/2022.
Abstract
To improve the reliability of static random access memory (SRAM), error-correcting codes (ECC) are typically used to protect the SRAM in a cache. Alongside the reliability gain, additional circuitry is needed to support ECC, including encoding and decoding logic. In a high-speed circuit such as a CPU, the L1 cache runs at the same frequency as the core, and decoding the ECC codewords consumes considerable combinational logic, limiting frequency and performance. This study proposes a high-performance, flexible ECC protection scheme for the cache with two working modes: a high-performance mode and a high-reliability mode. The high-performance mode uses simple ECC codes, which sustain a high frequency with low access latency. The high-reliability mode uses more complex ECC codes, which strengthen error correction and enhance the reliability of the SRAM. To meet the requirements of different scenarios, software can switch between the two modes by configuring a register, which improves system flexibility. Synthesis results show that the theoretical maximum frequency increases from approximately 1.4 GHz in the conventional ECC design to approximately 2.2 GHz. In the high-performance mode, some error correction capability is traded for a 57% increase in frequency. In the high-reliability mode, the error correction capability of the SRAM is enhanced, but cache access latency increases by one cycle.
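The abstract does not say which codes the two modes use; as a hedged illustration of the "simple" end of the trade-off, here is a minimal Hamming(7,4) single-error-correcting code in Python, the kind of lightweight SEC code a high-performance mode might favor:

```python
def hamming74_encode(d1, d2, d3, d4):
    """Encode 4 data bits into a 7-bit Hamming codeword (layout: p1 p2 d1 p3 d2 d3 d4)."""
    p1 = d1 ^ d2 ^ d4   # covers codeword positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4   # covers codeword positions 2, 3, 6, 7
    p3 = d2 ^ d3 ^ d4   # covers codeword positions 4, 5, 6, 7
    return [p1, p2, d1, p3, d2, d3, d4]

def hamming74_decode(code):
    """Correct at most one flipped bit and return the 4 data bits."""
    c = list(code)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]       # recompute each parity group
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s3 = c[3] ^ c[4] ^ c[5] ^ c[6]
    syndrome = s1 + 2 * s2 + 4 * s3      # 1-based position of the error, 0 if none
    if syndrome:
        c[syndrome - 1] ^= 1             # flip the erroneous bit back
    return [c[2], c[4], c[5], c[6]]
```

A real L1 cache would use a wider code such as SEC-DED Hamming(72,64), but the encode/decode XOR trees scale the same way, which is why deeper codes cost frequency.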
2. Kuo YP, Yu YJ, Hong TP, Lai WK. Genetic Approach for Joint Transmission Grouping in Next-Generation Cellular Networks. Sensors (Basel) 2022; 22:7147. PMID: 36236245; PMCID: PMC9571299; DOI: 10.3390/s22197147. Received 08/11/2022; accepted 09/16/2022.
Abstract
Coordinated multipoint joint transmission (JT) is one of the critical downlink transmission technologies for improving network throughput. However, multiple cells in a JT group must hold the same user data to transmit simultaneously, resulting in a considerable backhaul burden. Even when cells are already equipped with caches in fifth-generation networks, JT groups that do not effectively utilize the cached data still cause unnecessary backhaul traffic. In this article, we investigate the JT grouping problem while taking the caches at cells into consideration. We then propose a genetic approach to solve this problem with the objective of minimizing backhaul data traffic subject to the data-rate requirement of each user. The simulation results show that our proposed genetic algorithm can significantly decrease backhaul bandwidth consumption compared to the two baselines.
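The paper's encoding, operators, and cost model are not given in the abstract; the following toy genetic algorithm only sketches the general idea on an invented instance (the caches, users, and backhaul model below are all hypothetical, not the paper's):

```python
import random

# Hypothetical toy instance: per-cell cached contents, and users given as
# (serving_cell, requested_content) pairs.
CACHES = [{0, 1}, {1, 2}, {0, 2}, {2, 3}]
USERS = [(0, 0), (1, 1), (2, 2), (3, 3)]

def backhaul_cost(groups):
    """Backhaul traffic: every cell in a user's JT group that lacks the
    requested content in its cache must fetch it over the backhaul."""
    return sum(1
               for cell, content in USERS
               for c, g in enumerate(groups)
               if g == groups[cell] and content not in CACHES[c])

def evolve(pop_size=20, gens=50, n_groups=2, seed=0):
    """Tiny GA: tournament selection, one-point crossover, point mutation, elitism.
    A chromosome assigns each cell a JT-group id."""
    rng = random.Random(seed)
    n = len(CACHES)
    pop = [[0] * n] + [[rng.randrange(n_groups) for _ in range(n)]
                       for _ in range(pop_size - 1)]
    best = min(pop, key=backhaul_cost)
    for _ in range(gens):
        new = [best[:]]                        # elitism: keep the incumbent
        while len(new) < pop_size:
            a = min(rng.sample(pop, 3), key=backhaul_cost)  # tournament of 3
            b = min(rng.sample(pop, 3), key=backhaul_cost)
            cut = rng.randrange(1, n)
            child = a[:cut] + b[cut:]          # one-point crossover
            if rng.random() < 0.2:             # point mutation
                child[rng.randrange(n)] = rng.randrange(n_groups)
            new.append(child)
        pop = new
        best = min(pop + [best], key=backhaul_cost)
    return best
```

Elitism guarantees the returned grouping is never worse than the seeded baseline in which all cells form one JT group.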
Affiliation(s)
- Yu-Po Kuo
- Department of Computer Science and Engineering, National Sun Yat-sen University, Kaohsiung 804, Taiwan
- Ya-Ju Yu
- Department of Computer Science and Information Engineering, National University of Kaohsiung, Kaohsiung 811, Taiwan
- Tzung-Pei Hong
- Department of Computer Science and Engineering, National Sun Yat-sen University, Kaohsiung 804, Taiwan
- Department of Computer Science and Information Engineering, National University of Kaohsiung, Kaohsiung 811, Taiwan
- Wei-Kuang Lai
- Department of Computer Science and Engineering, National Sun Yat-sen University, Kaohsiung 804, Taiwan
3. Chi Y, Guo L, Cong J. Accelerating SSSP for Power-Law Graphs. FPGA 2022; 2022:190-200. PMID: 35300320; PMCID: PMC8926441; DOI: 10.1145/3490422.3502358.
Abstract
The single-source shortest path (SSSP) problem is one of the most important and well-studied graph problems, widely used in many application domains such as road navigation, neural image reconstruction, and social network analysis. Although various SSSP algorithms have been known for decades, implementing one efficiently for large-scale power-law graphs is still highly challenging today, because ① a work-efficient SSSP algorithm requires priority-order traversal of graph data, ② the priority queue needs to be scalable in both throughput and capacity, and ③ priority-order traversal requires extensive random memory accesses on graph data. In this paper, we present SPLAG to accelerate SSSP for power-law graphs on FPGAs. SPLAG uses a coarse-grained priority queue (CGPQ) to enable high-throughput priority-order graph traversal with a large frontier. To mitigate the high-volume random accesses, SPLAG employs a customized vertex cache (CVC) to reduce off-chip memory accesses and improve the throughput of reading and updating vertex data. Experimental results on various synthetic and real-world datasets show up to a 4.9× speedup over state-of-the-art SSSP accelerators, a 2.6× speedup over a 32-thread CPU running at 4.4 GHz, and a 0.9× speedup over an A100 GPU that has a 4.1× power budget and 3.4× the HBM bandwidth. Such performance would place SPLAG in the 14th position of the Graph 500 benchmark for data-intensive applications (the highest using a single FPGA) with only a 45 W power budget. SPLAG is written in high-level synthesis C++ and is fully parameterized, which means it can be easily ported to various FPGAs with different configurations. SPLAG is open-source at https://github.com/UCLA-VAST/splag.
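The CGPQ and CVC are hardware structures, but the priority-order traversal they accelerate is the classic Dijkstra pattern; a minimal software sketch of that pattern (not the paper's implementation) looks like:

```python
import heapq

def sssp(adj, src):
    """Dijkstra-style priority-order traversal.
    adj maps a vertex to a list of (neighbor, edge_weight) pairs."""
    dist = {src: 0}
    pq = [(0, src)]                    # software stand-in for SPLAG's CGPQ
    while pq:
        d, u = heapq.heappop(pq)
        if d > dist.get(u, float("inf")):
            continue                   # stale queue entry; skip it
        for v, w in adj.get(u, []):
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd           # vertex update (the CVC's job in hardware)
                heapq.heappush(pq, (nd, v))
    return dist
```

On power-law graphs the frontier in `pq` grows very large and the `dist` updates hit memory randomly, which is exactly the pair of bottlenecks the CGPQ and CVC target.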
Affiliation(s)
- Yuze Chi
- University of California, Los Angeles
4. de Lima HP, Teseo S, de Lima RLC, Ferreira-Châline RS, Châline N. Temporary prey storage along swarm columns of army ants: an adaptive strategy for successful raiding? Biol Lett 2022; 18:20210440. PMID: 35135318; PMCID: PMC8825983; DOI: 10.1098/rsbl.2021.0440. Open access.
Abstract
While pillaging the brood of other ant colonies, Eciton army ants accumulate prey in piles, or caches, along their foraging trails. Although widely documented, these structures have historically been considered by-products of heavy traffic or of aborted relocations of the ants' temporary nest, or bivouac. However, we recently observed that caches of the hook-jawed army ant, Eciton hamatum, appeared independently of heavy traffic or bivouac relocations. In addition, the flow of prey through caches varied with the quantity of prey items workers transported. As this suggested a potential adaptive function, we developed agent-based simulations to compare raids of caching and non-caching virtual army ants. We found that caches increased the amount of prey that relatively low numbers of raiders were able to retrieve. However, this advantage became less conspicuous, and generally disappeared, as the number of raiders increased. Based on these results, we hypothesize that caches maximize the amount of prey that limited numbers of raiders can retrieve, especially as prey colonies coordinately evacuate their brood. In principle, caches also allow workers to safely collect multiple prey items and transport them efficiently to the bivouac. Further field observations are needed to test this and other hypotheses emerging from our study.
Affiliation(s)
- Hilário Póvoas de Lima
- LEEEIS, Laboratory of Ethology, Ecology and Evolution of Insect Societies, Departamento de Psicologia Experimental, Instituto de Psicologia Experimental, Universidade de São Paulo, São Paulo, SP, Brazil; Programa de pós-graduação em Psicologia Experimental, USP, São Paulo, SP, Brazil
- Serafino Teseo
- School of Biological Sciences, Nanyang Technological University, Singapore
- Raquel Leite Castro de Lima
- LEEEIS, Laboratory of Ethology, Ecology and Evolution of Insect Societies, Departamento de Psicologia Experimental, Instituto de Psicologia Experimental, Universidade de São Paulo, São Paulo, SP, Brazil; Programa de pós-graduação em Psicologia Experimental, USP, São Paulo, SP, Brazil
- Ronara Souza Ferreira-Châline
- LEEEIS, Laboratory of Ethology, Ecology and Evolution of Insect Societies, Departamento de Psicologia Experimental, Instituto de Psicologia Experimental, Universidade de São Paulo, São Paulo, SP, Brazil; Programa de pós-graduação em Psicologia Experimental, USP, São Paulo, SP, Brazil
- Nicolas Châline
- LEEEIS, Laboratory of Ethology, Ecology and Evolution of Insect Societies, Departamento de Psicologia Experimental, Instituto de Psicologia Experimental, Universidade de São Paulo, São Paulo, SP, Brazil; Programa de pós-graduação em Psicologia Experimental, USP, São Paulo, SP, Brazil
5. Kim YS, Lee JM, Ryu JY, Ban TW. A New Cache Update Scheme Using Reinforcement Learning for Coded Video Streaming Systems. Sensors (Basel) 2021; 21:2867. PMID: 33921818; PMCID: PMC8073498; DOI: 10.3390/s21082867. Received 03/10/2021; accepted 04/15/2021.
Abstract
As the demand for video streaming has been rapidly increasing, new technologies for improving its efficiency have attracted much attention. In this paper, we investigate how to improve streaming efficiency by using clients' cache storage in exclusive-OR (XOR) coding-based video streaming, where multiple different video contents can be transmitted simultaneously in one transmission as long as prerequisite conditions are satisfied, significantly enhancing efficiency. We also propose a new cache update scheme using reinforcement learning. The proposed scheme uses a K-actor-critic (K-AC) network, which mitigates a disadvantage of actor-critic networks by yielding K candidate outputs and selecting the one with the highest value among the K candidates. The K-AC resides in each client, and each client trains it using only locally available information, without any feedback or signaling, so the proposed cache update scheme is completely decentralized. The performance of the proposed scheme was analyzed in terms of the average number of transmissions for XOR coding-based video streaming and compared to that of conventional cache update schemes. Our numerical results show that the proposed scheme can reduce the number of transmissions by up to 24% when there are 100 videos, 50 clients, and a cache size of 5.
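In the paper both actor and critic are neural networks trained locally per client; the selection step itself, picking the best of K candidate cache updates by critic value, can be sketched with a toy stand-in critic (the popularity table and candidates below are illustrative, not the paper's model):

```python
# Hypothetical per-video request probabilities standing in for a learned critic.
POPULARITY = {0: 0.5, 1: 0.3, 2: 0.15, 3: 0.05}

def critic(cache):
    """Toy value estimate of holding this set of videos in the cache."""
    return sum(POPULARITY.get(v, 0.0) for v in cache)

def k_ac_select(candidates):
    """K-AC selection step: keep the candidate cache update the critic values most."""
    return max(candidates, key=critic)
```

For example, among the K = 4 candidates `[{1, 2}, {2, 3}, {0, 1}, {0, 3}]` the selection keeps `{0, 1}`, the set covering the most request probability.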
Affiliation(s)
- Yu-Sin Kim
- Algorithm Team, Carvi, Seoul 08513, Korea
- Jeong-Min Lee
- Department of Information and Communication Engineering, Gyeongsang National University, Gyeongnam 53064, Korea
- Jong-Yeol Ryu
- Department of Information and Communication Engineering, Gyeongsang National University, Gyeongnam 53064, Korea
- Tae-Won Ban
- Department of Information and Communication Engineering, Gyeongsang National University, Gyeongnam 53064, Korea (corresponding author)
6. Cui Y, Gao F, Li W, Shi Y, Zhang H, Wen Q, Panaousis E. Cache-Based Privacy Preserving Solution for Location and Content Protection in Location-Based Services. Sensors (Basel) 2020; 20:E4651. PMID: 32824808; DOI: 10.3390/s20164651. Received 07/08/2020; accepted 08/15/2020.
Abstract
Location-Based Services (LBSs) play an increasingly important role in people's daily activities. While enjoying the convenience provided by LBSs, users may lose privacy because they report personal information to an untrusted LBS server. Although many approaches have been proposed to preserve users' privacy, most focus only on location privacy and do not consider query privacy. Moreover, many existing approaches rely heavily on a trusted third-party (TTP) server, which may become a single point of failure. To solve these problems, in this paper we propose a Cache-Based Privacy-Preserving (CBPP) solution for users in LBSs. Unlike previous approaches, the proposed CBPP solution protects location privacy and query privacy simultaneously, while avoiding the TTP server by having users collaborate with each other in a mobile peer-to-peer (P2P) environment. In the CBPP solution, each user keeps a buffer on his mobile device (e.g., a smartphone) to record service data and acts as a micro TTP server. When a user needs an LBS, he first sends the query to his neighbors to seek an answer, and contacts the LBS server only when the required service data cannot be obtained from his neighbors. In this way, the user reduces the number of queries sent to the LBS server; we argue that the fewer queries submitted to the LBS server, the less the user's privacy is exposed. For users who must send live queries to the LBS server, we employ l-diversity, a strong privacy-protection notion that guards against attackers with background knowledge, to further protect their privacy. Evaluation results show that the proposed CBPP solution can effectively protect users' location and query privacy with lower communication cost and better quality of service.
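The cache-first lookup order described above (local buffer, then P2P neighbors, then the untrusted server) can be sketched as follows; the function and parameter names are mine, not the paper's API:

```python
def cbpp_lookup(content, local_cache, neighbor_caches, server_log):
    """CBPP-style lookup order: local buffer first, then P2P neighbors,
    and the untrusted LBS server only as a last resort."""
    if content in local_cache:
        return "local"
    for cache in neighbor_caches:
        if content in cache:
            local_cache.add(content)   # remember the answer for future peers
            return "peer"
    server_log.append(content)         # only these queries expose privacy
    local_cache.add(content)
    return "server"
```

The shorter `server_log` grows, the less the user exposes; CBPP then applies l-diversity to the queries that do reach the server.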
7. Li J, Ranka S, Sahni S. Multicore and GPU Algorithms for Nussinov RNA Folding. IEEE Int Conf Comput Adv Bio Med Sci 2013. PMID: 24385211; PMCID: PMC3876873; DOI: 10.1109/ICCABS.2013.6629204.
Abstract
We develop cache-efficient, multicore, and GPU algorithms for RNA folding using Nussinov's equations. Our cache-efficient algorithm provides a speedup between 1.6 and 3.0 relative to a naive single-core implementation. The multicore version of the cache-efficient single-core algorithm provides a speedup, relative to the naive single-core algorithm, between 7.5 and 14.0 on a 6-core hyperthreaded CPU. Our GPU algorithm for the NVIDIA C2050 is up to 1582 times as fast as the naive single-core algorithm and between 5.1 and 11.2 times as fast as the fastest previously known GPU algorithm for Nussinov RNA folding.
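The recurrence all of these variants parallelize is Nussinov's dynamic program; a plain single-core reference version (simplified: no minimum loop length, Watson-Crick pairs only) looks like:

```python
def nussinov(seq, pairs=frozenset({"AU", "UA", "GC", "CG"})):
    """Maximum number of non-crossing base pairs in seq (Nussinov's DP).
    N[i][j] holds the best score for the subsequence seq[i..j]."""
    n = len(seq)
    N = [[0] * n for _ in range(n)]
    for span in range(1, n):                 # fill by increasing subsequence length
        for i in range(n - span):
            j = i + span
            best = max(N[i + 1][j], N[i][j - 1])       # i or j left unpaired
            if seq[i] + seq[j] in pairs:               # i pairs with j
                best = max(best, (N[i + 1][j - 1] if i + 1 <= j - 1 else 0) + 1)
            for k in range(i, j):                      # bifurcation split
                best = max(best, N[i][k] + N[k + 1][j])
            N[i][j] = best
    return N[0][n - 1] if n else 0
```

The O(n³) bifurcation loop and the diagonal-by-diagonal dependence pattern are what the cache-efficient, multicore, and GPU variants reorganize.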
8. Armstrong N, Garland A, Burns KC. Memory for multiple cache locations and prey quantities in a food-hoarding songbird. Front Psychol 2012; 3:584. PMID: 23293622; PMCID: PMC3533374; DOI: 10.3389/fpsyg.2012.00584. Received 08/31/2012; accepted 12/11/2012. Open access.
Abstract
Most animals can discriminate between pairs of numbers that are each less than four without training. However, North Island robins (Petroica longipes), a food-hoarding songbird endemic to New Zealand, can discriminate between quantities of items as high as eight without training. Here we investigate whether robins are capable of other complex quantity discrimination tasks. We test whether their ability to discriminate between small quantities declines with (1) the number of cache sites containing prey rewards and (2) the length of time separating cache creation and retrieval (the retention interval). Results showed that subjects generally performed above chance expectations. They were equally able to discriminate between different combinations of prey quantities hidden from view in 2, 3, or 4 cache sites after retention intervals of 1, 10, or 60 s. Overall, the results indicate that North Island robins can process complex quantity information involving more than two discrete quantities of items across retention intervals of up to 1 min without training.
Affiliation(s)
- Nicola Armstrong
- School of Biological Sciences, Victoria University of Wellington Wellington, New Zealand