Multithreading (computer architecture)

From Wikipedia, the free encyclopedia
Figure: A process with two threads of execution, running on a single processor

In computer architecture, multithreading is the ability of a central processing unit (CPU), or of a single core in a multi-core processor, to execute multiple processes or threads concurrently, with appropriate support from the operating system. This approach differs from multiprocessing: in multithreading, the processes and threads share the resources of one or more cores, namely the computing units, the CPU caches, and the translation lookaside buffer (TLB).

Where multiprocessing systems include multiple complete processing units, multithreading aims to increase utilization of a single core by using thread-level as well as instruction-level parallelism. As the two techniques are complementary, they are sometimes combined in systems with multiple multithreading CPUs and in CPUs with multiple multithreading cores.
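
For illustration, the following minimal C++11 sketch (the thread count, iteration count, and names are arbitrary) shows the property multithreading exploits: all threads created within one process run in the same address space, so they naturally share data, caches, and TLB entries for that address space, whereas multiprocessing would give each process its own complete processing context.

    // Four threads of one process incrementing one shared counter.
    // Build with a C++11 compiler, e.g.: g++ -std=c++11 -pthread example.cpp
    #include <atomic>
    #include <iostream>
    #include <thread>
    #include <vector>

    int main() {
        std::atomic<long> counter{0};              // state shared by all threads of the process
        auto work = [&counter] {
            for (int i = 0; i < 1000000; ++i)
                counter.fetch_add(1, std::memory_order_relaxed);
        };

        std::vector<std::thread> workers;
        for (int t = 0; t < 4; ++t)                // four threads, one address space
            workers.emplace_back(work);
        for (auto& w : workers)
            w.join();

        std::cout << "counter = " << counter << "\n";   // prints 4000000
        return 0;
    }

Whether those four threads end up on one multithreaded core or spread across several cores is decided by the operating system scheduler, which is the "appropriate support" mentioned above.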

Overview

The multithreading paradigm has become more popular as efforts to further exploit instruction-level parallelism have stalled since the late 1990s. This allowed the concept of throughput computing to re-emerge from the more specialized field of transaction processing; even though it is very difficult to further speed up a single thread or single program, most computer systems are actually multitasking among multiple threads or programs. Thus, techniques that improve the throughput of all tasks result in overall performance gains.

Two major techniques for throughput computing are multithreading and multiprocessing.

Advantages

If a thread gets a lot of cache misses, the other threads can continue taking advantage of the unused computing resources, which may lead to faster overall execution as these resources would have been idle if only a single thread were executed. Also, if a thread cannot use all the computing resources of the CPU (because instructions depend on each other's result), running another thread may prevent those resources from becoming idle.
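
A hedged sketch of this effect in software terms (the array size, constants, and thread names are invented, and whether the two threads actually share one physical core is up to the operating system scheduler and any affinity settings): one thread chases pointers through a large random cycle and spends most of its time waiting on cache misses, while a second, compute-bound thread can keep the otherwise idle execution resources busy.

    #include <chrono>
    #include <cstddef>
    #include <iostream>
    #include <numeric>
    #include <random>
    #include <thread>
    #include <vector>

    int main() {
        // Build one big random cycle (Sattolo's algorithm) so every load in the
        // chase depends on the previous load and rarely hits in the caches.
        const std::size_t n = 1 << 24;
        std::vector<std::size_t> next(n);
        std::iota(next.begin(), next.end(), std::size_t{0});
        std::mt19937_64 rng{42};
        for (std::size_t i = n - 1; i > 0; --i) {
            std::uniform_int_distribution<std::size_t> pick(0, i - 1);
            std::swap(next[i], next[pick(rng)]);
        }

        std::size_t chase_result = 0;
        auto memory_bound = [&] {                  // mostly waits on long-latency loads
            std::size_t idx = 0;
            for (std::size_t i = 0; i < n; ++i) idx = next[idx];
            chase_result = idx;
        };

        double fp_result = 0.0;
        auto compute_bound = [&] {                 // dependent arithmetic, little memory traffic
            double x = 0.0;
            for (long i = 0; i < 100000000; ++i) x = x * 1.0000001 + 1e-9;
            fp_result = x;
        };

        auto t0 = std::chrono::steady_clock::now();
        std::thread a(memory_bound), b(compute_bound);
        a.join(); b.join();
        auto t1 = std::chrono::steady_clock::now();

        std::cout << "both threads done in "
                  << std::chrono::duration<double>(t1 - t0).count() << " s"
                  << " (results: " << chase_result << ", " << fp_result << ")\n";
        return 0;
    }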

If several threads work on the same set of data, they can share a cache, leading to better cache utilization and faster synchronization on shared values.

Disadvantages

Multiple threads can interfere with each other when sharing hardware resources such as caches or translation lookaside buffers (TLBs). As a result, execution times of a single thread are not improved and can even be degraded, even when only one thread is executing, because of lower clock frequencies or the additional pipeline stages needed to accommodate the thread-switching hardware.

Overall efficiency varies; Intel claims up to 30% improvement with its Hyper-Threading Technology,[1] while a synthetic program just performing a loop of non-optimized dependent floating-point operations actually gains a 100% speed improvement when run in parallel. On the other hand, hand-tuned assembly language programs using MMX or AltiVec extensions and performing data prefetches (as a good video encoder might) do not suffer from cache misses or idle computing resources. Such programs therefore do not benefit from hardware multithreading and can indeed see degraded performance due to contention for shared resources.
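
The synthetic case described above can be approximated with a sketch like the following (loop bounds and constants are arbitrary, and actual numbers depend heavily on the machine): each worker executes a chain of dependent floating-point operations, so a single worker leaves most issue slots idle, and running two such workers on the two hardware threads of an SMT core can come close to doubling aggregate throughput.

    #include <chrono>
    #include <iostream>
    #include <thread>
    #include <vector>

    // A chain of dependent floating-point operations: each iteration needs the
    // previous result, so one thread alone cannot fill the core's issue slots.
    void dependent_fp_chain(long iters, double* out) {
        double x = 1.0;
        for (long i = 0; i < iters; ++i)
            x = x * 1.0000001 + 0.0000001;
        *out = x;                                  // each worker writes its own slot
    }

    double run(int workers, long iters) {
        std::vector<double> results(workers);
        auto t0 = std::chrono::steady_clock::now();
        std::vector<std::thread> pool;
        for (int i = 0; i < workers; ++i)
            pool.emplace_back(dependent_fp_chain, iters, &results[i]);
        for (auto& t : pool) t.join();
        auto t1 = std::chrono::steady_clock::now();
        return std::chrono::duration<double>(t1 - t0).count();
    }

    int main() {
        const long iters = 200000000;              // arbitrary; tune for the machine at hand
        std::cout << "1 worker : " << run(1, iters) << " s\n";
        std::cout << "2 workers: " << run(2, iters) << " s\n";
        // If the two workers land on the two hardware threads of one SMT core,
        // the 2-worker time is often close to the 1-worker time, i.e. roughly
        // double the aggregate throughput for this pathological workload.
        return 0;
    }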

From the software standpoint, hardware support for multithreading is more visible to software than multiprocessing is, and therefore requires more changes to both application programs and operating systems. Hardware techniques used to support multithreading often parallel the software techniques used for computer multitasking. Thread scheduling is also a major problem in multithreading.

Types of multithreading

Interleaved/Temporal multithreading

Coarse-grained multithreading

The simplest type of multithreading occurs when one thread runs until it is blocked by an event that would normally create a long-latency stall. Such a stall might be a cache miss that has to access off-chip memory, which might take hundreds of CPU cycles for the data to return. Instead of waiting for the stall to resolve, a threaded processor switches execution to another thread that is ready to run. Only when the data for the previous thread has arrived is that thread placed back on the list of ready-to-run threads.

For example:

  1. Cycle i: instruction j from thread A is issued.
  2. Cycle i + 1: instruction j + 1 from thread A is issued.
  3. Cycle i + 2: instruction j + 2 from thread A is issued, which is a load instruction that misses in all caches.
  4. Cycle i + 3: thread scheduler invoked, switches to thread B.
  5. Cycle i + 4: instruction k from thread B is issued.
  6. Cycle i + 5: instruction k + 1 from thread B is issued.

Conceptually, it is similar to cooperative multitasking used in real-time operating systems, in which tasks voluntarily give up execution time when they need to wait for some type of event. This type of multithreading is known as block, cooperative or coarse-grained multithreading.
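
The switching policy can be made concrete with a toy simulation (not a model of any real processor; the miss pattern, latency, and thread names are invented): issue from the current thread until it takes a simulated cache miss, then switch to another ready thread, and return the stalled thread to the ready pool once its data arrives. A real design would also pay a small switch penalty, such as a pipeline refill, which is omitted here.

    #include <cstdio>
    #include <vector>

    struct HwThread {
        char name;
        int  pc;               // index of the next instruction to issue
        int  ready_at;         // first cycle at which this thread can issue again
    };

    int main() {
        const int MISS_LATENCY = 5;                 // made-up off-chip miss latency, in cycles
        std::vector<HwThread> threads = {{'A', 0, 0}, {'B', 0, 0}};
        int current = 0;

        for (int cycle = 0; cycle < 12; ++cycle) {
            // If the current thread is stalled, try to switch to a ready one.
            if (threads[current].ready_at > cycle)
                for (int i = 0; i < (int)threads.size(); ++i)
                    if (threads[i].ready_at <= cycle) { current = i; break; }

            HwThread& t = threads[current];
            if (t.ready_at > cycle) {               // every thread is still waiting on memory
                std::printf("cycle %2d: bubble (all threads stalled)\n", cycle);
                continue;
            }
            bool miss = (t.pc % 3 == 2);            // pretend every third instruction misses
            std::printf("cycle %2d: issue %c[%d]%s\n", cycle, t.name, t.pc,
                        miss ? " -> cache miss, switch threads" : "");
            t.pc++;
            if (miss) t.ready_at = cycle + MISS_LATENCY;
        }
        return 0;
    }

The output follows the same pattern as the numbered example above: a few instructions from thread A, a miss, then instructions from thread B while A waits for its data.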

The goal of multithreading hardware support is to allow quick switching between a blocked thread and another thread ready to run. To achieve this goal, the hardware cost is to replicate the program-visible registers, as well as some processor control registers (such as the program counter). Switching from one thread to another means the hardware switches from using one register set to another; to switch efficiently between active threads, each active thread needs its own register set. For example, to quickly switch between two threads, the register hardware needs to be instantiated twice.

Additional hardware support for multithreading allows thread switching to be done in one CPU cycle, bringing performance improvements. Also, additional hardware allows each thread to behave as if it were executing alone and not sharing any hardware resources with other threads, minimizing the amount of software changes needed within the application and the operating system to support multithreading.

Many families of microcontrollers and embedded processors have multiple register banks to allow quick context switching for interrupts. Such schemes can be considered a type of block multithreading among the user program thread and the interrupt threads.[citation needed]

Interleaved multithreading

The purpose of interleaved multithreading is to remove all data dependency stalls from the execution pipeline. Since one thread is relatively independent of other threads, there is less chance of an instruction in one pipeline stage needing an output from an older instruction in the pipeline. Conceptually, it is similar to preemptive multitasking used in operating systems; an analogy would be that the time slice given to each active thread is one CPU cycle.

For example:

  1. Cycle i: an instruction from thread A is issued.
  2. Cycle i + 1: an instruction from thread B is issued.
  3. Cycle i + 2: an instruction from thread C is issued.

This type of multithreading was first called barrel processing, in which the staves of a barrel represent the pipeline stages and their executing threads. Interleaved, preemptive, fine-grained, and time-sliced multithreading are more modern terms.
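
A minimal sketch of this issue policy (the thread names and cycle count are arbitrary): one instruction per cycle, taken from the threads in strict round-robin order, so consecutive pipeline stages hold instructions from different threads.

    #include <cstdio>
    #include <vector>

    int main() {
        std::vector<char> threads = {'A', 'B', 'C', 'D'};
        std::vector<int>  pc(threads.size(), 0);

        for (int cycle = 0; cycle < 8; ++cycle) {
            int t = cycle % (int)threads.size();    // strict round-robin issue slot
            std::printf("cycle %d: issue %c[%d]\n", cycle, threads[t], pc[t]);
            pc[t]++;
        }
        // With four threads, each thread issues only every fourth cycle, so the
        // result of its previous instruction has three extra cycles to become
        // available before the thread's next instruction can need it.
        return 0;
    }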

In addition to the hardware costs discussed in the block type of multithreading, interleaved multithreading has an additional cost of each pipeline stage tracking the thread ID of the instruction it is processing. Also, since there are more threads being executed concurrently in the pipeline, shared resources such as caches and TLBs need to be larger to avoid thrashing between the different threads.

Simultaneous multithreading

The most advanced type of multithreading applies to superscalar processors. Whereas a normal superscalar processor issues multiple instructions from a single thread every CPU cycle, in simultaneous multithreading (SMT) a superscalar processor can issue instructions from multiple threads every CPU cycle. Recognizing that any single thread has a limited amount of instruction-level parallelism, this type of multithreading tries to exploit parallelism available across multiple threads to decrease the waste associated with unused issue slots.

For example:

  1. Cycle i: instructions j and j + 1 from thread A and instruction k from thread B are simultaneously issued.
  2. Cycle i + 1: instruction j + 2 from thread A, instruction k + 1 from thread B, and instruction m from thread C are all simultaneously issued.
  3. Cycle i + 2: instruction j + 3 from thread A and instructions m + 1 and m + 2 from thread C are all simultaneously issued.

To distinguish the other types of multithreading from SMT, the term "temporal multithreading" is used to denote when instructions from only one thread can be issued at a time.
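
The difference can be seen in a small sketch of a 3-wide SMT core (the issue width and the per-cycle "ready" counts are invented so as to reproduce the example above): every cycle, the issue slots are filled from whichever threads have instructions ready, so one cycle can mix instructions from several threads, whereas temporal multithreading would draw all of a cycle's instructions from a single thread.

    #include <cstdio>
    #include <vector>

    int main() {
        const int WIDTH = 3;                        // issue slots per cycle
        std::vector<char> names = {'A', 'B', 'C'};
        std::vector<int>  pc(names.size(), 0);
        // ready[cycle][thread]: how many instructions that thread can issue,
        // chosen to match the three-cycle example above.
        int ready[3][3] = {{2, 1, 0},               // cycle i
                           {1, 1, 1},               // cycle i + 1
                           {1, 0, 2}};              // cycle i + 2

        for (int cycle = 0; cycle < 3; ++cycle) {
            int slots = WIDTH;
            std::printf("cycle i+%d:", cycle);
            for (int t = 0; t < (int)names.size() && slots > 0; ++t)
                for (int k = 0; k < ready[cycle][t] && slots > 0; ++k, --slots)
                    std::printf(" %c[%d]", names[t], pc[t]++);
            std::printf("\n");                      // any unused slots would simply be wasted
        }
        return 0;
    }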

In addition to the hardware costs discussed for interleaved multithreading, SMT has the additional cost of each pipeline stage tracking the thread ID of each instruction being processed. Again, shared resources such as caches and TLBs have to be sized for the large number of active threads being processed.

Implementations include DEC (later Compaq) EV8 (not completed), Intel Hyper-Threading Technology, IBM POWER5, Sun Microsystems UltraSPARC T2, Cray XMT, and AMD Bulldozer and Zen microarchitectures.
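
From software, SMT implementations such as these usually appear as additional logical processors. As a small, hedged check (the standard permits a result of 0 when the count cannot be determined), C++'s std::thread::hardware_concurrency() reports how many hardware thread contexts the operating system exposes, which on a 2-way SMT part is typically twice the number of physical cores.

    #include <iostream>
    #include <thread>

    int main() {
        unsigned n = std::thread::hardware_concurrency();
        if (n == 0)
            std::cout << "logical CPU count unknown on this platform\n";
        else
            std::cout << "logical CPUs (hardware threads): " << n << "\n";
        return 0;
    }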

Implementation specifics

A major area of research is the thread scheduler, which must quickly choose which ready-to-run thread to execute next, as well as maintain the ready-to-run and stalled thread lists. An important subtopic is the different thread priority schemes that can be used by the scheduler. The thread scheduler might be implemented totally in software, totally in hardware, or as a hardware/software combination.
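
The bookkeeping involved can be sketched in software (real designs implement it in hardware logic; the priority policy, stall length, and field names below are invented): keep a ready list and a stalled list, pick the highest-priority ready thread each cycle, and move threads between the lists as they stall and wake.

    #include <cstdio>
    #include <list>

    struct ThreadCtx {
        int id;
        int priority;          // larger = more urgent (invented policy)
        int wake_cycle;        // for stalled threads: cycle at which their data arrives
    };

    int main() {
        std::list<ThreadCtx> ready   = {{0, 1, 0}, {1, 3, 0}, {2, 2, 0}};
        std::list<ThreadCtx> stalled;

        for (int cycle = 0; cycle < 6; ++cycle) {
            // Move stalled threads whose data has arrived back to the ready list.
            for (auto it = stalled.begin(); it != stalled.end();) {
                if (it->wake_cycle <= cycle) { ready.push_back(*it); it = stalled.erase(it); }
                else ++it;
            }
            if (ready.empty()) { std::printf("cycle %d: idle\n", cycle); continue; }

            // Pick the highest-priority ready thread.
            auto best = ready.begin();
            for (auto it = ready.begin(); it != ready.end(); ++it)
                if (it->priority > best->priority) best = it;
            std::printf("cycle %d: run thread %d (priority %d)\n",
                        cycle, best->id, best->priority);

            // Pretend the chosen thread immediately stalls (e.g., a cache miss)
            // for three cycles, so it moves to the stalled list.
            best->wake_cycle = cycle + 3;
            stalled.push_back(*best);
            ready.erase(best);
        }
        return 0;
    }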

Another area of research is which types of events should cause a thread switch: cache misses, inter-thread communication, DMA completion, etc.

If the multithreading scheme replicates all of the software-visible state, including privileged control registers and TLBs, then it enables virtual machines to be created for each thread. This allows each thread to run its own operating system on the same processor. On the other hand, if only user-mode state is saved, then less hardware is required, which would allow more threads to be active at one time for the same die area or cost.
