TINKER Research Publications





[1.4 MB]
Jesse G. Beu, Jason A. Poovey, Eric R. Hein, Thomas M. Conte, “High-Speed Formal Verification of Heterogeneous Coherence Hierarchies,” The 19th IEEE International Symposium on High Performance Computer Architecture (HPCA'13), (Shenzhen, China), Feb., 2013.

[1.8 MB | 3.5 MB]
R. A. Bheda, J. G. Beu, B. P. Railing, and T. M. Conte, “Extrapolation Pitfalls When Evaluating Limited Endurance Memory,” Proceedings of the IEEE 20th International Symposium on Modeling, Analysis and Simulation of Computer and Telecommunication Systems (MASCOTS'12), (Washington, D.C.), Aug., 2012.

[5.3 MB | 4.5 MB]
P. D. Bryan, J. A. Poovey, J. G. Beu and T. M. Conte, “Accelerating Multi-threaded Application Simulation Through Barrier-Interval Time-Parallelism,” Proceedings of the IEEE 20th International Symposium on Modeling, Analysis and Simulation of Computer and Telecommunication Systems (MASCOTS'12), (Washington, D.C.), Aug., 2012.

[1.9 MB | 1.1 MB]
J. G. Beu, M. C. Rosier and T. M. Conte, “Manager-Client Pairing: A Framework for Implementing Coherence Hierarchies,” Proceedings of the 44th Annual International Symposium on Microarchitecture (MICRO-44), (Porto Alegre, Brazil), Dec., 2011.

[813 KB]
R. A. Bheda, J. A. Poovey, J. G. Beu and T.M. Conte, "Energy Efficient Phase Change Memory Based Main Memory for Future High Performance Systems," Proceedings of the 2nd International Green Computing Conference (IGCC'11), Orlando, Florida, July 25-28, 2011.

[454 KB]
J. A. Poovey, B. P. Railing and T. M. Conte, “Parallel pattern detection for architectural improvements,” Proceedings of the 3rd USENIX Workshop on Hot Topics in Parallelism (HotPar), Berkeley, CA, May 26–27, 2011.

[1706 KB]
J. A. Poovey, M. C. Rosier, and T. M. Conte, "Pattern-Aware Dynamic Thread Mapping Mechanisms for Asymmetric Manycore Architectures," Technical Report No. 2011-1, School of Computer Science, Georgia Institute of Technology, 2011.

[1.6 MB]
J. A. Poovey, T. M. Conte, M. Levy, S. Gal-On, "A Benchmark Characterization of the EEMBC Benchmark Suite," IEEE Micro, vol. 29, no. 5, pp. 18-29, Sept.-Oct. 2009.

[1207 KB]
B. V. Iyer, "Length Adaptive Processors: A Solution for Energy/Performance Dilemma in Embedded Systems," Ph.D. Thesis, Department of Electrical and Computer Engineering, North Carolina State University, 2009.

[201 KB]
B. V. Iyer and T. M. Conte, "On Power and Energy Trends of IEEE 802.11n PHY," The 12th ACM International Conference on Modeling, Analysis and Simulation of Wireless and Mobile Systems (MSWiM), Canary Islands, Spain, Oct. 26th-30th, 2009.

[392 KB]
B. V. Iyer, J. G. Beu, and T. M. Conte, "Length Adaptive Processors: The solution for Energy/Performance Dilemma in Embedded Systems," Interact-13: Workshop on Interaction Between Compilers and Computer Architecture (Held in Conjunction with HPCA), Raleigh, NC, Feb 16th, 2009.

[237 KB]
B. V. Iyer, J. A. Poovey, and T. M. Conte, "Energy-Aware Opcode Design," 23rd International Conference on Computer Design, Lake Tahoe, CA, Oct 12-15, 2008.

[250 KB]
B. V. Iyer and T. M. Conte, "A Power Model for Register-Sharing Structures," 6th IFIP Conference on Parallel and Distributed Embedded Systems, Milano, Italy, Sept. 7-10, 2008.

[220 KB]
P. D. Bryan and T. M. Conte, "Combining Cluster Sampling with Single Pass Methods for Efficient Sampling Regimen Design," IEEE International Conference on Computer Design, Lake Tahoe, California, October 2007.

[128 KB]
P. D. Bryan, M. C. Rosier, and T. M. Conte, "Reverse State Reconstruction for Sampled Microarchitectural Simulation," IEEE International Symposium on Performance Analysis of Systems and Software, San Jose, California, April 2007.

[128 KB]
M. C. Rosier and T. M. Conte, "Treegion Instruction Scheduling in GCC," Proceedings of the 2006 GCC Developers Summit, (Ottawa, Canada), June 2006. (Slides from the talk given at GCC Developers' Summit '06).

[1.27 MB]
S. Sharma, J. G. Beu and T. M. Conte, "Spectral prefetcher: An effective mechanism for L2 cache prefetching," ACM Transactions on Architecture and Code Optimization, vol. 2, no. 4, Dec. '05, pp. 423-450.

[1.46 MB]
E. Ozer and T. M. Conte, "High-performance and low-cost dual-thread VLIW processor using WELD architecture paradigm," IEEE Transactions on Parallel and Distributed Systems, vol. 16, no. 12, Dec. '05.

[2.06 MB]
H. Zhou and T. M. Conte, "Enhancing memory-level parallelism via recovery-free value prediction," IEEE Transactions on Computers, vol. C-54, no. 7, Jul. '05, pp. 897-912.

[1.1 MB]
P. Mehrotra, V. Rao, T. M. Conte and P. D. Franzon, "Optimal Chip Package Co-design for High Performance DSP," IEEE Transactions on Advanced Packaging, vol. 28, no. 2, May '05, pp. 288-297.

[303 KB]
A. Bechini, T. M. Conte, C. A. Prete, "Opportunities and challenges in embedded systems," IEEE Micro, July-August, '04, pp. 2-3.

[687 KB]
H. Zhou, M. C. Toburen, E. Rotenberg, and T. M. Conte, "Adaptive Mode Control: A Static-Power-Efficient Cache Design," ACM Transactions on Embedded Computing Systems (TECS), vol. 2, no. 3, Aug. '03, pp. 347-372.

[130 KB]
H. Zhou and T.M. Conte, "Enhancing Memory Level Parallelism via Recovery-Free Value Prediction," Proceedings of the 2003 International Conference on Supercomputing (ICS'03) (San Francisco, CA), June 2003. (Slides from the talk given at ICS'03).

[143 KB]
H. Zhou, J. Flanagan, and T.M. Conte, "Detecting Global Stride Locality in Value Streams," Proceedings of the 30th International Symposium on Computer Architecture (ISCA-30) (San Diego, CA), June 2003. (Slides from the talk given at ISCA-30).

[2.4 MB]
C. Fu, J.T. Bodine, and T.M. Conte, "Modeling Value Speculation: An Optimal Edge Selection Problem," IEEE Transactions on Computers, vol. 52, no. 3, March 2003.

[228 KB]
J.T. Bodine, "Exploiting Computational Locality in Global Value Histories," Master's thesis, Department of Electrical and Computer Engineering, North Carolina State University, Raleigh, North Carolina, 2002.

[133 KB]
H. Zhou and T.M. Conte, "Code Size Efficiency in Global Scheduling for ILP Processors," Proceedings of the 6th Annual Workshop on the Interaction between Compilers and Computer Architectures (INTERACT-6) held in conjunction with the 8th International Symposium on High Performance Computer Architecture (HPCA-8) (Cambridge, MA), February 2002. (Slides from the talk given at INTERACT-6).

[149 KB]
H. Zhou and T.M. Conte, "Code Size Efficiency in Global Scheduling for VLIW/EPIC Style Embedded Processors," Technical Report. Dept. of Electrical and Computer Engineering, North Carolina State University, Raleigh, NC, 27695-7914, January 2002.

[355 KB]
E. Ozer, "Architectural and Compiler Issues For Tolerating Latencies in Horizontal Architectures," Ph.D. thesis, Department of Electrical and Computer Engineering, North Carolina State University, Raleigh, North Carolina, 2001.

[368 KB]
C. Fu, "Compiler-Driven Value Speculation Scheduling," Ph.D. thesis, Department of Electrical and Computer Engineering, North Carolina State University, Raleigh, North Carolina, 2001.

[73 KB]
E. Ozer, T. M. Conte, and S. Sharma, "Weld: A Multithreading Technique Towards Latency-Tolerant VLIW Processors," Proceedings of the 8th International Conference on High Performance Computing (HiPC'01) (Hyderabad, India), December 2001.
Read about our Weld research in EE Times

[223 KB]
M.D. Jennings, H. Zhou, and T.M. Conte, "A Treegion-based Unified Approach to Speculation and Predication in Global Instruction Scheduling," Technical Report. Dept. of Electrical and Computer Engineering, North Carolina State University, Raleigh, NC, 27695-7914, August 2001.

[85 KB]
H. Zhou, M. D. Jennings, and T. M. Conte, "Tree Traversal Scheduling: A Global Scheduling Technique for VLIW/EPIC Processors," Proceedings of the 14th Annual Workshop on Languages and Compilers for Parallel Computing (LCPC'01) (Cumberland Falls, KY), August 2001. (Slides from the talk given at LCPC'01).

[116 KB]
H. Zhou, M. C. Toburen, E. Rotenberg, and T. M. Conte, "Adaptive Mode Control: A Static-Power-Efficient Cache Design," Proceedings of the 2001 International Conference on Parallel Architectures and Compilation Techniques (PACT'01) (Barcelona, Spain), Sept. 2001. (Slides from the talk given at PACT'01).

[589 KB]
K. M. Hazelwood, M. C. Toburen, and T. M. Conte, "A Case for Exploiting Memory-Access Persistence," Proceedings of the 2001 Workshop on Memory Performance Issues held in conjunction with the 2001 International Symposium on Computer Architecture (Gothenburg, Sweden), June 2001. (Slides from the talk given at MPI'01).

[471 KB]
H. Zhou, M. C. Toburen, E. Rotenberg, and T. M. Conte, "Adaptive Mode Control: A Static-Power-Efficient Cache Design." Technical Report. Dept. of Electrical and Computer Engineering, North Carolina State University, Raleigh, NC, 27695-7914, Nov. 2000.

[100 KB]
K. M. Hazelwood and T. M. Conte, "A Lightweight Algorithm for Dynamic If-Conversion during Dynamic Optimization," Proceedings of the 2000 International Conference on Parallel Architectures and Compilation Techniques (PACT'00) (Philadelphia, PA), Oct. 2000, pp. 71-80. (Slides from the talk given at PACT'00).

[878 KB]
S. Y. Larin, "Exploiting Program Redundancy to Improve Performance, Cost and Power Consumption in Embedded Systems," Ph.D. thesis, Department of Electrical and Computer Engineering, North Carolina State University, Raleigh, North Carolina, 2000.

[438 KB]
K. M. Hazelwood, "Dynamic Optimization Infrastructure and Algorithms for IA-64," Master's thesis, Department of Electrical and Computer Engineering, North Carolina State University, Raleigh, North Carolina, 2000.

[367 KB]
V. S. Rao, "IA-64 Code Generation," Master's thesis, Department of Electrical and Computer Engineering, North Carolina State University, Raleigh, North Carolina, 2000.

[270 KB]
T. M. Conte, K. N. Menezes, S. W. Sathaye, and M. C. Toburen, "System-Level Power Consumption Modeling and Tradeoff Analysis Techniques for Superscalar Processor Design," IEEE Trans. on VLSI Systems, vol. 8, no. 2, April 2000.

[505 KB]
M. C. Toburen, "Power analysis and instruction scheduling for reduced di/dt in the execution core of high-performance microprocessors," Master's thesis, Department of Electrical and Computer Engineering, North Carolina State University, Raleigh, NC, August 1999.

[113 KB]
S. Y. Larin and T. M. Conte, "Compiler-driven cached code compression schemes for embedded ILP processors," Proceedings of the 32nd Annual International Symposium on Microarchitecture, (Haifa, Israel), Nov. 1999. (Slides from the talk given at Micro-32).

[240 KB]
E. Ozer and T. M. Conte, "Unified Cluster Assignment and Instruction Scheduling for Clustered VLIW Microarchitectures," Technical Report, Dept. of Electrical and Computer Engineering, North Carolina State University, Raleigh, NC, 27695-7911, Jan. 1999.

[381 KB]
S. Banerjia, K. N. Menezes, S. W. Sathaye and T. M. Conte, "MPS: Miss-path scheduling for multiple issue processors," IEEE Trans. Computers, vol. 47, no. 12, Dec. 1998.

[59 KB]
C. Fu and T. M. Conte, "Value speculation mechanisms for EPIC architectures," Technical Report. Dept. of Electrical and Computer Engineering, North Carolina State University, Raleigh, NC, 27695-7911, Oct. 1998.

[16 KB]
T. M. Conte, "Evolutionary compilation to long instruction superscalar microarchitectures for exploiting parallelism at all levels," abstract for presentation to "Wild and Crazy Ideas Session," 8th International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS-VIII), Oct 6, 1998.

[301 KB]
E. Ozer, S. Banerjia, T. M. Conte, "Unified assign and schedule: A new approach to scheduling for clustered register file microarchitectures," Proceedings of the 31st Annual International Symposium on Microarchitecture, (Dallas, TX), Nov. 1998.

[155 KB]
M. D. Jennings and T. M. Conte, "Subword extensions for video processing on mobile systems," IEEE Concurrency, July-September, 1998, pp. 13-16.

[59 KB]
C. Fu, M. D. Jennings, S. Y. Larin and T. M. Conte, "Software-Only Value Speculation Scheduling." Technical Report. Dept. of Electrical and Computer Engineering, North Carolina State University, Raleigh, NC 27695-7911, June, 1998.

[59 KB]
E. Ozer, S. W. Sathaye, K. N. Menezes, S. Banerjia, M. D. Jennings and T. M. Conte, "A fast interrupt handling scheme for VLIW processors," Proceedings of the 1998 International Conference on Parallel Architectures and Compilation Techniques (PACT'98) (Paris, France), Oct. 1998.

[999 KB]
S. W. Sathaye, "Evolutionary compilation for code compatibility and performance," Ph.D. thesis, Department of Electrical and Computer Engineering, North Carolina State University, Raleigh, North Carolina, 1998.

[91 KB]
C. Fu, M. D. Jennings, S. Y. Larin and T. M. Conte, "Value Speculation Scheduling for High Performance Processors," Proceedings of the 8th International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS-VIII), (San Jose, CA), Oct. 4-7, 1998.

[31 KB]
M. C. Toburen, T. M. Conte, and M. Reilly, "Instruction Scheduling for Low Power Dissipation in High Performance Microprocessors." Presented at the Power Driven Microarchitecture Workshop in conjunction with the 25th International Symposium on Computer Architecture (ISCA'98) (Barcelona, Spain), June 1998.

[265 KB]
W. A. Havanki, S. Banerjia and T. M. Conte, "Treegion scheduling for wide-issue processors," Proceedings of the 4th International Symposium on High-Performance Computer Architecture (HPCA-4), (Las Vegas), Feb. 1998.

[491 KB]
K. N. P. Menezes, "Hardware-based profiling for program optimization," Ph.D. thesis, Department of Electrical and Computer Engineering, North Carolina State University, Raleigh, North Carolina, 1997.

[88 KB]
T. M. Conte, P. K. Dubey, M. D. Jennings, R. B. Lee, S. Rathnam, M. Schlansker, P. Song, A. Wolfe, "Challenges to combining general-purpose and multimedia processors," IEEE Computer, Dec. 1997.

[423 KB]
W. A. Havanki, "Treegion scheduling for VLIW processors," Master's thesis, Dept. of Electrical and Computer Engineering, North Carolina State University, Raleigh, NC 27695-7911, July 1997.

[196 KB]
K. N. Menezes, S. W. Sathaye and T. M. Conte, "Path prediction for high issue-rate processors," Proceedings of the 1997 International Conference on Parallel Architectures and Compilation Techniques (PACT'97), (San Francisco), Nov. 1997.

[166 KB]
T. M. Conte, M. A. Hirsch and W. W. Hwu, "Combining Trace Sampling With Single Pass Methods for Efficient Cache Simulation," IEEE Transactions on Computers, vol. C-47, no. 6, Jun. 1998.

N/A
T. M. Conte and S. W. Sathaye, "Optimization of VLIW Compatibility Systems Employing Dynamic Rescheduling," International Journal of Parallel Programming, vol. 25, no. 2, Feb. 1997.

[412 KB]
T. M. Conte, S. Banerjia, S. Y. Larin, K. N. Menezes and S. W. Sathaye, "Instruction cache designs for a class of statically scheduled instruction level parallel architectures." Technical report. Dept. of Electrical and Computer Engineering, North Carolina State University, Raleigh, NC 27695-7911, May 1997. (In review).

[328 KB]
T. M. Conte and S. W. Sathaye, "Properties of rescheduling size invariance for dynamic rescheduling-based VLIW cross-generation compatibility." Technical report. Dept. of Electrical and Computer Engineering, North Carolina State University, Raleigh, NC 27695-7911, April 1997. (In review).

[190 KB]
S. Banerjia, W. A. Havanki and T. M. Conte, "Treegion scheduling for highly parallel processors," Proceedings of the 3rd International Euro-Par Conference (Euro-Par'97), (Passau, Germany), pp. 1074-1078, Aug. 1997.

[195 KB]
T. M. Conte, S. Banerjia, S. Y. Larin, K. N. Menezes and S. W. Sathaye, "Instruction fetch mechanisms for VLIW architectures with compressed encodings," Proceedings of the 29th Annual International Symposium on Microarchitecture, (Paris, France), pp. 201-211, Dec. 1996.

[128 KB]
T. M. Conte, K. N. Menezes and M. A. Hirsch, "Accurate and practical profile-driven compilation using the profile buffer," Proceedings of the 29th Annual International Symposium on Microarchitecture, (Paris, France), pp. 36-45, Dec. 1996.

[1.2 MB]
T. M. Conte, S. W. Sathaye and S. Banerjia, "A persistent rescheduled-page cache for low overhead object code compatibility in VLIW architectures," Proceedings of the 29th Annual International Symposium on Microarchitecture, (Paris, France), pp. 4-13, Dec. 1996.

[206 KB]
S. Banerjia, K. N. Menezes, and T. M. Conte, "NextPC computation for a banked instruction cache for a VLIW architecture with a compressed encoding." Technical report. Dept. of Electrical and Computer Engineering, North Carolina State University, Raleigh, NC 27695-7911, June 1996.
(Note: This report was previously titled "NextPC computation for a banked instruction cache").

[209 KB]
T. M. Conte, M. A. Hirsch, and K. N. Menezes, "Reducing state loss for effective trace sampling of superscalar processors," Proceedings of the 1996 International Conference on Computer Design, (Austin, TX), Oct. 1996.

[214 KB]
T. M. Conte, B. A. Patel, K. N. Menezes and J. S. Cox, "Hardware-based profiling: An effective technique for profile-driven optimization," International Journal of Parallel Programming, vol. 24, no. 2, Feb. 1996.

[130 KB]
A. Singla and T. M. Conte, "Bipartitioning for hybrid FPGA-software simulation," Proceedings of the 1996 IEEE International Conference on VLSI Design, (Bangalore, India), Jan. 1996.

[120 KB]
M. D. Jennings, "Multimedia Extensions to TINKER." Technical Report, Dept. of Electrical and Computer Engineering, North Carolina State University, Raleigh, NC 27695-7911, June 1995.

N/A
T. M. Conte and C. E. Gimarc, eds., Fast Simulation of Computer Architectures, Kluwer Academic Publishers: Boston, MA, 1995, ISBN 0-7923-9593-X.

[234 KB]
T. M. Conte and S. W. Sathaye, "Dynamic rescheduling: A technique for object code compatibility in VLIW architectures," Proceedings of the 28th Annual International Symposium on Microarchitecture, (Ann Arbor, MI), Nov. 1995.

[378 KB]
T. M. Conte, K. N. Menezes, P. M. Mills and B. A. Patel, "Optimization of instruction fetch mechanisms for high issue rates," Proceedings of the 22nd Annual International Symposium on Computer Architecture, (Santa Margherita, Italy), Jun. 1995.

[139 KB]
J. S. Cox, D. P. Howell, and T. M. Conte, "Commercializing profile-driven optimization," Proceedings of the 28th Hawaii International Conference on System Sciences, vol. 1, (Maui, HI), pp. 221-228, Jan. 1995.

[224 KB]
T. M. Conte, K. N. P. Menezes and S. W. Sathaye, "A technique to determine power-efficient, high-performance superscalar processors," Proceedings of the 28th Hawaii International Conference on System Sciences, vol. 1, (Maui, HI), pp. 324-333, Jan. 1995.

N/A
T. M. Conte, "Superscalar and VLIW Processors," in Handbook of Parallel and Distributed Computing, (A. Y. Zomaya, ed.), McGraw-Hill: New York, 1995. [McGraw-Hill copyright protected; buy the book.]

N/A
The TINKER Machine Language Manual, North Carolina State University, 1995. [Contact a group member to obtain a copy.]

[225 KB]
T. M. Conte, B. A. Patel, and J. S. Cox, "Using branch handling hardware to support profile-driven optimization," Proceedings of the 27th Annual International Symposium on Microarchitecture, (San Jose, CA), Dec. 1994.

[833 KB]
T. M. Conte, "Tradeoffs in processor/memory interfaces for superscalar processors," Proceedings of the 25th Annual International Symposium on Microarchitecture, (Portland, OR), pp. 202-205, Dec. 1992.