Object-oriented programming (OOP) is a programming paradigm based on the concept of "objects", which can contain data and code: data in the form of fields (often known as attributes or properties) and code in the form of procedures (often known as methods). This topic also covers modularization, packages, etc.
Functional programming is a programming paradigm where programs are constructed by applying and composing functions. It is a declarative programming paradigm in which function definitions are trees of expressions that map values to other values, rather than a sequence of imperative statements which update the running state of the program.
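As a rough illustration of "applying and composing functions", the sketch below builds a small compose helper and uses map/filter instead of loops that mutate state. The names compose, double and increment are made up for this example.

```python
from functools import reduce
from typing import Callable


def compose(*funcs: Callable) -> Callable:
    """Compose functions right to left: compose(f, g)(x) == f(g(x))."""
    return reduce(lambda f, g: lambda x: f(g(x)), funcs, lambda x: x)


def double(x: int) -> int:
    return x * 2


def increment(x: int) -> int:
    return x + 1


# A new function built purely by composition; no state is updated anywhere.
double_then_increment = compose(increment, double)

# Expressions mapping values to values instead of a loop mutating a list.
squares_of_evens = list(map(lambda n: n * n, filter(lambda n: n % 2 == 0, range(6))))
```

Calling double_then_increment(5) first doubles and then increments its argument.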
Website security is the act/practice of protecting websites from unauthorized access, use, modification, destruction, or disruption (important principles of modern web security, security best practices, Content Security Policy, CORS, OWASP Security Risks). GDPR, hashing, encryption, SSL topics can be covered here as well.
HTML (Hypertext Markup Language) is the code that is used to structure a web page and its content. For example, content could be structured within a set of paragraphs, a list of bulleted points, or using images and data tables.
CSS (Cascading Style Sheets) is the language used to style an HTML document. CSS describes how HTML elements should be displayed.
Authorization and authentication require knowledge of cookies, sessions, JWT, tokens, and OAuth. In simple terms, authentication is the process of verifying who a user is, while authorization is the process of verifying what they have access to.
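The difference between the two checks can be sketched with the standard library alone. This is a simplified illustration, not a production scheme: the in-memory USERS store, the role names and the iteration count are assumptions made for the example.

```python
import hashlib
import hmac
import os


def hash_password(password: str, salt: bytes) -> bytes:
    # PBKDF2 with SHA-256; the iteration count here is an illustrative choice.
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, 100_000)


_salt = os.urandom(16)
# Hypothetical in-memory user store; a real system would use a database.
USERS = {
    "alice": {
        "salt": _salt,
        "password_hash": hash_password("s3cret", _salt),
        "roles": {"reader"},
    }
}


def authenticate(username: str, password: str) -> bool:
    """Authentication: is this user who they claim to be?"""
    user = USERS.get(username)
    if user is None:
        return False
    candidate = hash_password(password, user["salt"])
    # Constant-time comparison avoids leaking information through timing.
    return hmac.compare_digest(candidate, user["password_hash"])


def authorize(username: str, required_role: str) -> bool:
    """Authorization: is this (already authenticated) user allowed to do it?"""
    user = USERS.get(username)
    return user is not None and required_role in user["roles"]
```

A request handler would typically call authenticate first and authorize only afterwards.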
GraphQL is a query language for your API, and a server-side runtime for executing queries using a type system you define for your data. GraphQL isn't tied to any specific database or storage engine and is instead backed by your existing code and data. GraphQL Best Practices (HTTP, JSON, Versioning, Nullability, Pagination, Server-side Batching and Caching)
JSON, YAML, XML (DOM/SAX/XPath), validation. At an advanced level, it is useful to know about schema evolution and about data serialization systems, platforms, frameworks, libraries, and binary protocols (Protobuf/Thrift, Kryo, Avro, Parquet).
Unix Compatible, Linux (Debian/Ubuntu, CentOS/RHEL), MS Windows, Mac OS X
Object-oriented design (OOD) is the process of using an object-oriented methodology to design a computing system or application. OOD concepts: coupling, cohesion, strong encapsulation. OOD principles: DRY, KISS, YAGNI, and SOLID are software design principles aimed at clean code.
Software Development Life Cycle (SDLC) is a process used by the software industry to design, develop and test high-quality software. The SDLC aims to produce high-quality software that meets or exceeds customer expectations and reaches completion within time and cost estimates. The most popular methodologies include Scrum, Kanban, Waterfall, and XP, but teams also use others or a mix of these.
The Open Systems Interconnection (OSI) model describes seven layers that computer systems use to communicate over a network.
IP address is the host identification number used for proper communication between devices. The IP address is a number assigned to a network interface, a group of interfaces (broadcast, multicast addresses) or to the entire computer network, used to identify network components and being one of the elements enabling them to communicate.
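Python's standard ipaddress module can be used to explore these ideas. The sketch below checks whether a host address belongs to a network and inspects broadcast and multicast addresses; the specific addresses are arbitrary examples.

```python
import ipaddress

# An address assigned to a single interface.
host = ipaddress.ip_address("192.168.1.10")

# An address range describing an entire network.
network = ipaddress.ip_network("192.168.1.0/24")

host_in_network = host in network
broadcast = network.broadcast_address       # address that reaches every host
address_count = network.num_addresses       # size of the /24 block
is_multicast = ipaddress.ip_address("224.0.0.1").is_multicast
```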
Network protocols are a set of rules, conventions, and data structures that dictate how devices exchange data across networks. In other words, network protocols can be equated to languages that two devices must understand for seamless communication of information, regardless of their infrastructure and design disparities (DHCP, DNS, SSH, TCP/UDP, NTP, LDAP).
Short for virtual local area network, VLAN allows a network administrator to set up separate networks by configuring a network device, such as a router, without adjusting cabling. A VLAN allows a network to be divided, set up, and changed by a network administrator to organize and filter data accordingly.
Protocol - a set of rules or procedures for transmitting data between two or more entities of a communications system.
A web browser (commonly referred to as a browser) is a software application for retrieving, presenting, and traversing information resources on the World Wide Web. Although browsers are primarily intended to use the World Wide Web, they can also be used to access information provided by web servers in private networks or files in file systems. (Render, parsing, optimization etc.)
A relational database is a type of database. It uses a structure that allows us to identify and access data in relation to another piece of data in the database. Often, data in a relational database is organized into tables. The most popular include SQL Server, PostgreSQL, MariaDB, MySQL, SQLite, Oracle, etc.
A NoSQL (originally referring to "non-SQL" or "non-relational") database provides a mechanism for storage and retrieval of data that is modeled in means other than the tabular relations used in relational databases. When working with NoSQL databases, it is useful to know about CAP/PACELC and about key-value, document-oriented, column-oriented, map-reduce, graph, real-time, and multi-model stores. There are plenty of NoSQL DBs and management systems: MongoDB, Redis, LiteDB, Apache Cassandra, RavenDB, CouchDB, etc.
A cloud database is a database that typically runs on a cloud computing platform, with access to the database provided as a service. Depending on the cloud provider, different databases are available: Azure CosmosDB, Amazon DynamoDB, etc.
A powerful search engine behind your database helps customers find what they are looking for. Useful topics here are ElasticSearch, Solr, Lucene, indexing/tokenization, indexes, ranking, heavy load, etc.
Set of properties that a database transaction in a relational database is supposed to have (Atomicity, Consistency, Isolation, Durability)
A transaction can be defined as a group of tasks. A single task is the minimum processing unit which cannot be divided further.
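Atomicity can be illustrated with the standard sqlite3 module: either both updates of a transfer are applied, or neither is. This is a minimal sketch; the accounts table, the CHECK constraint and the transfer helper are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts ("
    "name TEXT PRIMARY KEY, "
    "balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("alice", 100), ("bob", 0)])
conn.commit()


def transfer(conn: sqlite3.Connection, src: str, dst: str, amount: int) -> bool:
    """Move money between accounts as one all-or-nothing unit of work."""
    try:
        # The connection context manager commits on success and rolls the
        # whole transaction back if an exception is raised inside the block.
        with conn:
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?", (amount, dst)
            )
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?", (amount, src)
            )
        return True
    except sqlite3.IntegrityError:
        # The CHECK constraint failed, so the credit above was undone as well.
        return False


def balances(conn: sqlite3.Connection) -> dict:
    return dict(conn.execute("SELECT name, balance FROM accounts ORDER BY name"))
```

An overdraft attempt leaves both balances untouched, even though the credit statement ran before the failing debit.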
The N+1 query problem happens when your code executes N additional query statements to fetch the same data that could have been retrieved when executing the primary query.
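The sketch below contrasts the two access patterns on a tiny in-memory SQLite database. The authors/books schema and the function names are invented for the illustration; ORMs usually trigger the first pattern implicitly through lazy loading.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE authors (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE books (id INTEGER PRIMARY KEY, author_id INTEGER, title TEXT);
    INSERT INTO authors VALUES (1, 'Ann'), (2, 'Bob');
    INSERT INTO books VALUES (1, 1, 'A1'), (2, 1, 'A2'), (3, 2, 'B1');
    """
)


def titles_n_plus_one(conn):
    """One query for the authors, then one extra query per author."""
    queries = 1
    result = {}
    authors = conn.execute("SELECT id, name FROM authors ORDER BY id").fetchall()
    for author_id, name in authors:
        rows = conn.execute(
            "SELECT title FROM books WHERE author_id = ? ORDER BY id", (author_id,)
        ).fetchall()
        queries += 1
        result[name] = [title for (title,) in rows]
    return result, queries


def titles_single_join(conn):
    """The same data fetched with a single JOIN."""
    result = {}
    rows = conn.execute(
        "SELECT a.name, b.title FROM authors a "
        "JOIN books b ON b.author_id = a.id ORDER BY a.id, b.id"
    )
    for name, title in rows:
        result.setdefault(name, []).append(title)
    return result, 1
```

Both functions return the same mapping, but the first issues a number of queries that grows with the number of authors.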
Database normalization is the process of structuring a database, usually a relational database, in accordance with a series of so-called normal forms in order to reduce data redundancy and improve data integrity. It was first proposed by Edgar F. Codd as part of his relational model.
A database index is a data structure that improves the speed of data retrieval operations on a database table at the cost of additional writes and storage space to maintain the index data structure.
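One way to see an index at work is to ask SQLite for its query plan before and after creating one. This is a sketch under assumptions: the users table and the index name idx_users_email are hypothetical, and the exact wording of the plan text differs between SQLite versions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT, name TEXT)")
conn.executemany(
    "INSERT INTO users (email, name) VALUES (?, ?)",
    [(f"user{i}@example.com", f"User {i}") for i in range(1000)],
)


def query_plan(conn: sqlite3.Connection) -> str:
    """Return SQLite's description of how it would run the lookup."""
    rows = conn.execute(
        "EXPLAIN QUERY PLAN SELECT name FROM users WHERE email = ?",
        ("user42@example.com",),
    ).fetchall()
    # The human-readable detail is the last column of each plan row.
    return " | ".join(row[-1] for row in rows)


plan_before = query_plan(conn)   # expected to describe a full table scan
conn.execute("CREATE INDEX idx_users_email ON users (email)")
plan_after = query_plan(conn)    # expected to mention the new index
```

The index speeds up the lookup, but every later INSERT or UPDATE on email must also maintain it, which is the write and storage cost mentioned above.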
This skill is useful when working with a heavily loaded RDBMS. DB design means that you know how to optimize reads and writes and are aware of scalability, partitioning, replication, indexing, and related topics.
Alembic is a lightweight database migration tool for usage with the SQLAlchemy Database Toolkit for Python.
The basic data structures in Python include the list, set, tuple, and dictionary. Each of the data structures is unique in its own way. Data structures are “containers” that organize and group data according to type.
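A short sketch of the four built-in containers and what distinguishes them; the variable names and values are arbitrary.

```python
# list: ordered, mutable, allows duplicates
fruits = ["apple", "banana", "apple"]
fruits.append("cherry")

# tuple: ordered, immutable, often used for fixed-size records
point = (3, 4)

# set: unordered, unique elements, fast membership tests
unique_fruits = set(fruits)

# dict: maps unique keys to values
stock = {"apple": 2, "banana": 1}
stock["cherry"] = stock.get("cherry", 0) + 1
```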
Input/Output(I/O). I/O operations are very important when it comes to processing financial data or scientific data. Exception handling is a construct in some programming languages to handle or deal with errors automatically. Many programming languages like C++, Objective-C, PHP, Java, Ruby, Python, and many others have built-in support for exception handling.
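The two topics meet whenever a program reads a file. The sketch below shows one possible policy: a missing file is treated as "no data", while malformed content is reported with a clearer error. The read_numbers helper and the file names are hypothetical.

```python
import os
import tempfile


def read_numbers(path: str) -> list:
    """Read one number per line, skipping blank lines."""
    try:
        with open(path, encoding="utf-8") as fh:
            return [float(line) for line in fh if line.strip()]
    except FileNotFoundError:
        # A missing file is treated as an empty data set in this sketch.
        return []
    except ValueError as exc:
        # Re-raise with context so the caller knows which file was bad.
        raise ValueError(f"{path} contains a non-numeric line") from exc


# Demonstration on a temporary file that is removed afterwards.
with tempfile.TemporaryDirectory() as tmp:
    sample_path = os.path.join(tmp, "numbers.txt")
    with open(sample_path, "w", encoding="utf-8") as fh:
        fh.write("1.5\n\n2\n")
    loaded = read_numbers(sample_path)
    missing = read_numbers(os.path.join(tmp, "missing.txt"))
```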
A metaclass in Python is a class of a class that defines how a class behaves. A class is itself an instance of a metaclass. A class in Python defines how the instance of the class will behave.
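A common use of a metaclass is to customise class creation, for example to register every subclass automatically. The sketch below assumes hypothetical names (RegistryMeta, Plugin, CsvExporter); it is one possible pattern, not the only way to build a registry.

```python
class RegistryMeta(type):
    """Metaclass that records every concrete subclass it creates."""

    registry = {}

    def __new__(mcls, name, bases, namespace, **kwargs):
        cls = super().__new__(mcls, name, bases, namespace, **kwargs)
        if bases:  # skip the root base class, which has no parents listed
            mcls.registry[name.lower()] = cls
        return cls


class Plugin(metaclass=RegistryMeta):
    pass


class CsvExporter(Plugin):
    pass
```

Here type(Plugin) is RegistryMeta, which shows that the class itself is an instance of its metaclass.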
__slots__ provides a special mechanism to reduce the size of objects: it declares a fixed set of attributes, so instances do not need a per-instance __dict__. It is a memory optimization for objects.
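The effect can be observed directly: a class that declares __slots__ has no per-instance __dict__ and rejects attributes that were not declared. The class and helper names below are invented for the example.

```python
class PointWithDict:
    def __init__(self, x, y):
        self.x = x
        self.y = y


class PointWithSlots:
    __slots__ = ("x", "y")  # only these attributes may exist on instances

    def __init__(self, x, y):
        self.x = x
        self.y = y


def can_add_attribute(obj) -> bool:
    """Try to attach an undeclared attribute to the object."""
    try:
        obj.z = 0
    except AttributeError:
        return False
    return True
```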
A decorator in Python is a function that takes another function as its argument and returns yet another function.
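A minimal sketch of that definition: count_calls takes a function and returns a wrapper that counts invocations. The names count_calls and greet are illustrative only.

```python
import functools


def count_calls(func):
    """Decorator that records how many times the wrapped function ran."""

    @functools.wraps(func)  # keep the original name and docstring
    def wrapper(*args, **kwargs):
        wrapper.calls += 1
        return func(*args, **kwargs)

    wrapper.calls = 0
    return wrapper


@count_calls
def greet(name: str) -> str:
    return f"Hello, {name}!"
```

The @count_calls line is shorthand for greet = count_calls(greet).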
A Python generator function produces a sequence of values lazily, yielding one value each time it reaches a yield statement. An iterator is an object that returns its values one at a time through next(); every generator is an iterator.
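The sketch below shows a generator function and a plain iterator side by side; countdown is a made-up example.

```python
def countdown(start: int):
    """Generator function: execution pauses at each yield."""
    while start > 0:
        yield start
        start -= 1


gen = countdown(3)
first = next(gen)        # runs the body up to the first yield
rest = list(gen)         # consumes the remaining values

numbers = iter([10, 20])  # any iterable can hand out an iterator
first_number = next(numbers)
second_number = next(numbers)
# A further next(numbers) would raise StopIteration.
```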
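Multiple inheritance in Python is when a class inherits from multiple classes. The Method Resolution Order (MRO) determines the order in which the base classes are searched when a method or attribute is looked up.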
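A small diamond hierarchy makes the lookup order visible. The class names are arbitrary; the who method exists only to trace which classes are visited.

```python
class Base:
    def who(self) -> str:
        return "Base"


class Left(Base):
    def who(self) -> str:
        return "Left > " + super().who()


class Right(Base):
    def who(self) -> str:
        return "Right > " + super().who()


class Child(Left, Right):
    pass


# The linearised lookup order Python computes for Child.
mro_names = [cls.__name__ for cls in Child.__mro__]
```

Note that super() in Left refers to the next class in the MRO of the instance, which is Right here, not Base.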
The Global Interpreter Lock (GIL) in CPython is a mutex that allows only one thread to execute Python bytecode at a time. It protects the interpreter's internal state, such as object reference counts, from being modified by several threads at once.
The garbage collector is keeping track of all objects in memory. The Python garbage collector has three generations in total, and an object moves into an older generation whenever it survives a garbage collection process on its current generation.
PEP stands for Python Enhancement Proposal, and there are several of them. A PEP is a document that describes new features proposed for Python and documents aspects of Python, like design and style, for the community.
Python 2.7.0 was released on July 3rd, 2010. Python 2.7 was the last major version in the 2.x series; it then moved into an extended maintenance period, which ended on January 1, 2020. This release contains many of the features that were first released in Python 3.1.
Releases of Python 3 include the 2to3 utility, which automates (at least partially) the translation of Python 2 code to Python 3.
An interpreter is a program that reads source code and executes it statement by statement. It executes the instructions directly, whether they are written in a programming language or a scripting language, without requiring a separate step that compiles the whole program to machine code first.
Python is a modular language that imports most useful operations from the standard library. The Python Standard Library is a collection of modules accessible to a Python program that simplify the programming process and remove the need to rewrite commonly used code. They can be used by importing them, usually at the beginning of a script.
Concurrency means that two or more tasks make progress during overlapping periods of time. Multithreading is the ability of a CPU (or a single core) to provide multiple threads of execution concurrently, supported by the operating system.
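For I/O-bound work, threads let the waiting periods overlap. The sketch below simulates I/O with time.sleep, which releases the GIL while waiting; fake_io_task and both runner functions are hypothetical helpers.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fake_io_task(task_id: int) -> int:
    """Stand-in for a network or disk call that mostly waits."""
    time.sleep(0.1)
    return task_id * 2


def run_sequentially(n: int) -> list:
    return [fake_io_task(i) for i in range(n)]


def run_in_threads(n: int) -> list:
    # Each task waits in its own thread, so the waits overlap.
    with ThreadPoolExecutor(max_workers=n) as pool:
        # pool.map preserves the order of the inputs in its results.
        return list(pool.map(fake_io_task, range(n)))
```

Both functions return the same results; the threaded version is expected to finish sooner because the sleeps run concurrently.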
A traditional web server does not understand or have any way to run Python applications. There are a few approaches for running Python web applications on web servers. Relevant topics here are CGI, FastCGI, WSGI, ASGI, and uWSGI.
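Of these interfaces, WSGI is the simplest to show: an application is a callable that receives the request environment and a start_response callback. The sketch below is a minimal application; the response text is arbitrary.

```python
def application(environ, start_response):
    """Minimal WSGI application: the server calls this once per request."""
    path = environ.get("PATH_INFO", "/")
    body = f"Hello from WSGI, you requested {path}\n".encode("utf-8")
    start_response(
        "200 OK",
        [
            ("Content-Type", "text/plain; charset=utf-8"),
            ("Content-Length", str(len(body))),
        ],
    )
    # The return value is an iterable of byte strings.
    return [body]
```

For local experiments, the standard library's wsgiref.simple_server.make_server can serve this callable; in production a WSGI server such as Gunicorn would typically be used instead.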
In computer networks, a reverse proxy is a type of proxy server that retrieves resources on behalf of a client from one or more servers. These resources are then returned to the client, appearing as if they originated from the reverse proxy server itself. It is commonly used for load balancing, as well as for TLS termination and caching.
Gunicorn 'Green Unicorn' is a Python WSGI HTTP Server for UNIX. It uses a pre-fork worker model, ported from Ruby's Unicorn project. The Gunicorn server is broadly compatible with various web frameworks.
Caching is a data storage technique that is omnipresent throughout computer systems and plays an important role in designing scalable Internet applications. A cache is any data store that can store and retrieve data quickly for future use, enabling faster response times and decreasing load on other parts of your system. Besides caching strategies, caching issues, integrating cache, what to cache, invalidation strategies, there are some helpful caching tools: Memcached, Redis, Ignite, Hazelcast, Varnish, NGINX, etc.
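At the smallest scale, the same idea is available in-process through functools.lru_cache. The sketch below memoises a function and then invalidates the cache; slow_square and the CALLS counter are stand-ins for an expensive operation.

```python
import functools

CALLS = {"count": 0}  # tracks how often the real work is performed


@functools.lru_cache(maxsize=128)
def slow_square(n: int) -> int:
    """Pretend this is an expensive query or computation."""
    CALLS["count"] += 1
    return n * n


first = slow_square(4)    # cache miss: the function body runs
second = slow_square(4)   # cache hit: the stored result is returned
stats = slow_square.cache_info()

slow_square.cache_clear()  # simplest invalidation strategy: drop everything
third = slow_square(4)     # recomputed after invalidation
```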
A Web framework is a collection of packages or modules which allow developers to write Web applications (see WebApplications) or services without having to handle such low-level details as protocols, sockets or process/thread management. Some of the most popular are - Django, Flask/Sanic, aiohttp, FastAPI, Tornado/Twisted, Web2py, Pyramid, Pylons, TurboGears, Bottle, CherryPy, Web.py, Zope and others.
Celery is a task queue implementation for Python web applications used to asynchronously execute work outside the HTTP request-response cycle.
Python and Big Data is the new combination invading the market space now. Python is in great demand among Big Data companies. There are a lot of tools you can use for work with Big Data: Apache Spark, Apache Kafka, Apache Hadoop/MapReduce, Dask, Apache Hive, Apache Beam, ClickHouse, Apache Flink, Apache Tez, Apache Samza, etc.
Making decisions for the business, forecasting weather, studying protein structures in biology or designing a marketing campaign. All of these scenarios involve a multidisciplinary approach of using mathematical models, statistics, graphs, databases and of course the business or scientific logic behind the data analysis. Python shines bright as it has numerous libraries and built-in features which makes it easy to tackle the needs of Data Science. (NumPy, Pandas, Matplotlib, SciPy, SciKit-Learn, TensorFlow, Keras, Seaborn, PyTorch, NLTK, Gensim, Theano, MXNet).
Integration testing is used to test a group of individual modules, components, or units. The main purpose of integration testing is to find bugs that appear when two or more modules are integrated, that is, to check how two or more modules, components, or pieces of code work together.
Unlike unit testing, which focuses on individual modules and classes, end-to-end (e2e) testing covers the interaction of classes and modules at a more aggregate level - closer to the kind of interaction that end-users will have with the production system.
The main objective of unit testing is to isolate written code to test and determine if it works as intended. Unit testing typically comprises three stages: planning, writing test cases and scripts, and running the unit test itself.
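A small example with the standard unittest module; the slugify function under test is hypothetical and exists only to give the tests something to isolate.

```python
import unittest


def slugify(title: str) -> str:
    """Lower-case the title and join its words with hyphens."""
    return "-".join(title.lower().split())


class SlugifyTests(unittest.TestCase):
    def test_joins_words_with_hyphens(self):
        self.assertEqual(slugify("Hello World"), "hello-world")

    def test_collapses_extra_whitespace(self):
        self.assertEqual(slugify("  Hello   World "), "hello-world")

    def test_empty_title_gives_empty_slug(self):
        self.assertEqual(slugify(""), "")
```

Saved in a file such as test_slugify.py (the name is an assumption), the tests can be run with python -m unittest.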
There are a few libraries that help you with Python development: the debugger enables you to step through code, analyze stack frames and set breakpoints etc., and the profilers run code and give you a detailed breakdown of execution times, allowing you to identify bottlenecks in your programs. Auditing events provide visibility into runtime behaviors that would otherwise require intrusive debugging or patching.
It is worth spending a little bit of extra time to set up formatting and linting tools that will help keep your code clean and enforce good development practices. In Python there are a few of them: PyLint, flake8, MyPy, pyflakes, black, etc.
Coverage measurement is typically used to gauge the effectiveness of tests. It can show which parts of your code are being exercised by tests, and which are not. Coverage.py is a tool for measuring code coverage of Python programs.
Load testing is a great way to grab insights about how your application runs under heavy load, how all services interact, and to plan production capacity accordingly.
Test automation is the practice of running tests automatically, managing test data, and utilizing results to improve software quality. Automated testing is well-suited for large projects, projects that require testing the same areas over and over, and projects that have already been through an initial manual testing process.
“Test-driven development” refers to a style of programming in which three activities are tightly interwoven: coding, testing (in the form of writing unit tests) and design (in the form of refactoring).
In software engineering, behavior-driven development (BDD) is an agile software development process that encourages collaboration among developers, quality assurance testers, and customer representatives in a software project. BDD is an extension to the TDD concept, but instead of testing your code you are testing your product, and specifically that your product behaves as you desire it to.
At its core, the main purpose of Python virtual environments is to create an isolated environment for Python projects. This means that each project can have its own dependencies, regardless of what dependencies every other project has. The most used virtual environment and dependency management tools in Python are Virtualenv, venv (Python 3.3+), Pipenv, Poetry, Conda, and Docker.
You may have heard about PyPI, setup.py, and wheel files. These are just a few of the tools Python’s ecosystem provides for distributing Python code to developers. If you heard previous things then most probably you know about Python Packaging Authority (PyPA) and their Python Packaging User Guide, which is the authoritative resource on how to package, publish and install Python projects using current tools.
Version control, also known as source control, is the practice of tracking and managing changes to software code. Version control systems are software tools that help software teams manage changes to source code over time. A few of the most popular are Git, Mercurial, SVN, etc.
Containers are a form of operating system virtualization. A single container might be used to run anything from a small microservice or software process to a larger application. Inside a container are all the necessary executables, binary code, libraries, and configuration files. Compared to server or machine virtualization approaches, however, containers do not contain operating system images. This makes them more lightweight and portable, with significantly less overhead. In larger application deployments, multiple containers may be deployed as one or more container clusters. Such clusters might be managed by a container orchestrator such as Kubernetes.
CI (Continuous Integration) and CD (Continuous Delivery) are part of the DevOps culture in which you combine development and operational processes into a single and collaborative workflow to make sure the two teams are on the same page. There are many tools and principles within this scope of activity: Jenkins, CodeShip, TeamCity, CircleCI, GitLab, Travis, VSTS, etc.; deployment strategies: rolling, blue-green, canary deployments, etc.
Monitoring tools can proactively capture, analyze, trace, and display information related to developed applications. They provide full-stack visibility, which can ultimately help identify and fix application performance bottlenecks and improve the user’s experience. The most popular: Zabbix, Nagios, Prometheus, DataDog, NewRelic, Graphite/Grafana, etc.
Measuring performance provides an important metric to help you assess the success of your app, site, or web service. For example, you can use performance metrics to determine how your app performs in comparison to a competitor or you can compare your app's performance across releases. The metrics you choose to measure should be relevant to your users, site, and business goals. They should be collected and measured in a consistent manner and analyzed in a format that can be consumed and understood by non-technical stakeholders.
AWS Lambda is an event-driven, serverless computing platform provided by Amazon as a part of Amazon Web Services. It is a computing service that runs code in response to events and automatically manages the computing resources required by that code. It was introduced in November 2014.
Azure Functions is an event driven, compute-on-demand experience that extends the existing Azure application platform with capabilities to implement code triggered by events occurring in Azure or third party service as well as on-premises systems.
Google Cloud Functions is a serverless execution environment for building and connecting cloud services. With Cloud Functions you write simple, single-purpose functions that are attached to events emitted from your cloud infrastructure and services.
Heroku is a cloud platform as a service (PaaS) supporting several programming languages. One of the first cloud platforms, Heroku has been in development since June 2007, when it supported only the Ruby programming language, but now supports Java, Node.js, Scala, Clojure, Python, PHP, and Go.
DigitalOcean is a cloud infrastructure provider that provides cloud computing services to business entities. It is used to scale by deploying DigitalOcean applications that run parallel across multiple cloud servers without compromising performance.
PythonAnywhere is a web hosting platform and an Integrated Development Environment (IDE). It uses the Python language to power its technology.
Linode, LLC is an American privately-owned cloud hosting company that provides virtual private servers.
OpenStack is a free, open standard cloud computing platform. It is mostly deployed as infrastructure-as-a-service (IaaS) in both public and private clouds where virtual servers and other resources are made available to users.
The Rackspace Cloud is a set of cloud computing products and services billed on a utility computing basis from the US-based company Rackspace. Offerings include Cloud Storage ("Cloud Files"), virtual private server ("Cloud Servers"), load balancers, databases, backup, and monitoring.
An algorithm is a step-by-step procedure that defines a set of instructions to be executed in a certain order to get the desired output. From the data structure point of view, the following are some important categories of algorithms: search, sort, insert, update, delete.
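Binary search is a compact example from the "search" category: each step halves the range that can still contain the target. The sketch assumes the input list is already sorted in ascending order.

```python
def binary_search(items: list, target) -> int:
    """Return the index of target in a sorted list, or -1 if absent."""
    low, high = 0, len(items) - 1
    while low <= high:
        mid = (low + high) // 2
        if items[mid] == target:
            return mid
        if items[mid] < target:
            low = mid + 1    # target can only be in the upper half
        else:
            high = mid - 1   # target can only be in the lower half
    return -1
```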
Serverless is a cloud-native development model that allows developers to build and run applications without having to manage servers.
High availability architecture is an approach to defining the components, modules, or implementation of services of a system that ensures optimal operational performance, even at times of high loads. Although there are no fixed rules for implementing High Availability systems, there are generally a few good practices that one must follow so that you gain the most out of the least resources.
Design patterns represent the best practices used by experienced object-oriented software developers. Design patterns are solutions to general problems that software developers faced during software development. These solutions were obtained by trial and error by numerous software developers over quite a substantial period of time. A few examples: Factory, Strategy, Observer, Middleware, Singleton, State.
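As one example, the Strategy pattern makes an algorithm interchangeable at runtime. In Python it is often sketched with plain functions rather than a class hierarchy; the Checkout class and the pricing functions below are hypothetical.

```python
from typing import Callable, Iterable


def no_discount(total: float) -> float:
    return total


def ten_percent_off(total: float) -> float:
    return round(total * 0.9, 2)


class Checkout:
    """Context object: delegates the pricing rule to a swappable strategy."""

    def __init__(self, pricing: Callable[[float], float] = no_discount):
        self.pricing = pricing

    def total(self, prices: Iterable[float]) -> float:
        return self.pricing(sum(prices))
```

Swapping the strategy changes the behaviour without touching Checkout itself, for example Checkout(ten_percent_off).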
Architecture AntiPatterns focus on the system-level and enterprise-level structure of applications and components. If design patterns are the good guys, then the anti-patterns are the bad guys. And sometimes a good guy can turn into a bad guy. This happens in Hollywood movies, but it also happens in software engineering.
REST and SOAP are 2 different approaches to online data transmission. Specifically, both define how to build application programming interfaces (APIs), which allow data to be communicated between web applications.
The microservice architecture enables the rapid, frequent and reliable delivery of large, complex applications. It also enables an organization to evolve its technology stack.
Choosing the mode of communication is a fundamental decision that needs to be taken with great care. Services must handle requests from the application’s clients. Furthermore, services often collaborate to handle those requests. Consequently, they must use an inter-process communication protocol. This skill covers message brokers (RabbitMQ, Apache Kafka, ActiveMQ, Azure Service Bus), message buses (Distribus, BusMQ), etc.
An Event-Driven Architecture for data and applications is a modern design approach centered around data that describes “events” (i.e., something that just happened). Examples of events include the taking of measurement, the pressing of a button, or the swiping of a credit card. In an event-driven architecture, decoupled applications can asynchronously publish and subscribe to events via an event broker (modern messaging-oriented-middleware).
Service-Oriented Architecture (SOA) is a stage in the evolution of application development and/or integration. It defines a way to make software components reusable using the interfaces.
Data-driven development in software engineering accepts the central role that data in its primary form takes in the applications that software developers create.