Multicore Application Programming For Windows, Linux, and Oracle Solaris

Chapter 1 : Hardware, Processes, and Threads


Hardware, Processes, and Threads
Examining the Insides of a Computer
The Motivation for Multicore Processors
Supporting Multiple Threads on a Single Chip
Increasing Instruction Issue Rate with Pipelined Processor Cores
Using Caches to Hold Recently Used Data
Using Virtual Memory to Store Data
Translating from Virtual Addresses to Physical Addresses
The Characteristics of Multiprocessor Systems
How Latency and Bandwidth Impact Performance
The Translation of Source Code to Assembly Language
The Performance of 32-Bit versus 64-Bit Code
Ensuring the Correct Order of Memory Operations
The Differences Between Processes and Threads

Chapter 2 : Coding for Performance


Coding for Performance
Defining Performance
Understanding Algorithmic Complexity
Why Algorithmic Complexity Is Important
Using Algorithmic Complexity with Care
How Structure Impacts Performance
Performance and Convenience Trade-Offs in Source Code and Build Structures
Using Libraries to Structure Applications
The Impact of Data Structures on Performance
The Role of the Compiler
The Two Types of Compiler Optimization
Selecting Appropriate Compiler Options
How Cross-File Optimization Can Be Used to Improve Performance
Using Profile Feedback
How Potential Pointer Aliasing Can Inhibit Compiler Optimizations
Identifying Where Time Is Spent Using Profiling
Commonly Available Profiling Tools
How Not to Optimize
Performance by Design

Chapter 3 : Identifying Opportunities for Parallelism


Identifying Opportunities for Parallelism
Using Multiple Processes to Improve System Productivity
Multiple Users Utilizing a Single System
Improving Machine Efficiency Through Consolidation
Using Containers to Isolate Applications Sharing a Single System
Hosting Multiple Operating Systems Using Hypervisors
Using Parallelism to Improve the Performance of a Single Task
One Approach to Visualizing Parallel Applications
How Parallelism Can Change the Choice of Algorithms
Amdahl’s Law
Determining the Maximum Practical Threads
How Synchronization Costs Reduce Scaling
Parallelization Patterns
Data Parallelism Using SIMD Instructions
Parallelization Using Processes or Threads
Multiple Independent Tasks
Multiple Loosely Coupled Tasks
Multiple Copies of the Same Task
Single Task Split Over Multiple Threads
Using a Pipeline of Tasks to Work on a Single Item
Division of Work into a Client and a Server
Splitting Responsibility into a Producer and a Consumer
Combining Parallelization Strategies
How Dependencies Influence the Ability to Run Code in Parallel
Antidependencies and Output Dependencies
Using Speculation to Break Dependencies
Critical Paths
Identifying Parallelization Opportunities

Chapter 4 : Synchronization and Data Sharing


Synchronization and Data Sharing
Data Races
Using Tools to Detect Data Races
Avoiding Data Races
Synchronization Primitives
Mutexes and Critical Regions
Spin Locks
Semaphores
Readers-Writer Locks
Barriers
Atomic Operations and Lock-Free Code
Deadlocks and Livelocks
Communication Between Threads and Processes
Storing Thread-Private Data

Chapter 5 : Using POSIX Threads


Using POSIX Threads
Creating Threads
Compiling Multithreaded Code
Process Termination
Sharing Data Between Threads
Variables and Memory
Multiprocess Programming
Sockets
Reentrant Code and Compiler Flags

Chapter 6 : Windows Threading


Windows Threading
Creating Native Windows Threads
Terminating Threads
Creating and Resuming Suspended Threads
Using Handles to Kernel Resources
Methods of Synchronization and Resource Sharing
An Example of Requiring Synchronization Between Threads
Protecting Access to Code with Critical Sections
Protecting Regions of Code with Mutexes
Slim Reader/Writer Locks
Signaling Event Completion to Other Threads or Processes
Wide String Handling in Windows
Creating Processes
Sharing Memory Between Processes
Inheriting Handles in Child Processes
Naming Mutexes and Sharing Them Between Processes
Communicating with Pipes
Communicating Using Sockets
Atomic Updates of Variables
Allocating Thread-Local Storage
Setting Thread Priority

Chapter 7 : Using Automatic Parallelization and OpenMP


Using Automatic Parallelization and OpenMP
Using Automatic Parallelization to Produce a Parallel Application
Identifying and Parallelizing Reductions
Automatic Parallelization of Codes Containing Calls
Assisting Compiler in Automatically Parallelizing Code
Using OpenMP to Produce a Parallel Application
Using OpenMP to Parallelize Loops
Runtime Behavior of an OpenMP Application
Variable Scoping Inside OpenMP Parallel Regions
Parallelizing Reductions Using OpenMP
Accessing Private Data Outside the Parallel Region
Improving Work Distribution Using Scheduling
Using Parallel Sections to Perform Independent Work
Nested Parallelism
Using OpenMP for Dynamically Defined Parallel Tasks
Keeping Data Private to Threads
Controlling the OpenMP Runtime Environment
Waiting for Work to Complete
Restricting the Threads That Execute a Region of Code
Ensuring That Code in a Parallel Region Is Executed in Order
Collapsing Loops to Improve Workload Balance
Enforcing Memory Consistency
An Example of Parallelization

Chapter 8 : Hand-Coded Synchronization and Sharing


Hand-Coded Synchronization and Sharing
Atomic Operations
Using Compare and Swap Instructions to Form More Complex Atomic Operations
Enforcing Memory Ordering to Ensure Correct Operation
Compiler Support of Memory-Ordering Directives
Reordering of Operations by the Compiler
Volatile Variables
Operating System–Provided Atomics
Lockless Algorithms
Dekker’s Algorithm
Producer-Consumer with a Circular Buffer
Scaling to Multiple Consumers or Producers
Scaling the Producer-Consumer to Multiple Threads
Modifying the Producer-Consumer Code to Use Atomics
The ABA Problem

Chapter 9 : Scaling with Multicore Processors


Scaling with Multicore Processors
Constraints to Application Scaling
Hardware Constraints to Scaling
Bandwidth Sharing Between Cores
False Sharing
Cache Conflict and Capacity
Pipeline Resource Starvation
Operating System Constraints to Scaling
Multicore Processors and Scaling

Chapter 10 : Other Parallelization Technologies


Other Parallelization Technologies
GPU-Based Computing
Language Extensions
Alternative Languages
Clustering Technologies
Transactional Memory
Vectorization



