This course is intended for students interested in the efficient use of modern parallel systems ranging from multi-core and many-core processors to large-scale distributed memory clusters. The course puts equal emphasis on the theoretical foundations of parallel computing and practical aspects of different parallel programming models. It begins with a survey of common parallel architectures and types of parallelism, and then follows with an overview of formal approaches to assess scalability and efficiency of parallel algorithms and their implementations. In the second part, the course covers the most common and current parallel programming techniques and APIs, including for shared address space, many-core accelerators, distributed memory clusters and big data analytics platforms. Each component of the course involves solving practical computational and data driven problems, ranging from basic algorithms like sorting or searching, to graphs and numerical data analysis.