The Pythonista’s Guide to the OWASP Top 10

 1 year ago
source link: https://devm.io/python/python-owasp-app-security

If you're like many developers or IT professionals, you may have mixed feelings about security checklists. I get it. On one hand, you know they're important for keeping your applications and systems secure. On the other hand, they can be overwhelming, dry, and difficult to implement.

But what if I told you that security checklists don't have to be that way? What if I showed you how to approach these lists in a whole new light, and make easy work of the items on them? That's exactly what this article is all about.

In the following article, I'll take you through a list of common security issues created by the Open Worldwide Application Security Project (in case you missed it, they recently changed their name).

I'll break down each item and show you a few practical ways you can start addressing them. Look for things that apply to you and your work, and jot those down for later. When it comes time to implement these elements into your design and your codebase, you’ll be able to make quick work of it – and you’ll be able to make a snazzy PowerPoint showing your bosses why they are lucky to have you on the team.

OWASP Top 10: One List to Secure Them All

#1 - Broken Access Control

Access control enforces policy such that users cannot act outside of their intended permissions. Failures typically lead to unauthorized information disclosure, modification, or destruction of all data or performing a business function outside the user's limits.

A01:2021-Broken Access Control

When does this apply to a Pythonista?

This will apply if you’re providing an application or service that has privileged data. If you’re not locked down behind a flawless perimeter that has only privileged users inside, you need to have access controls in place. And because every perimeter can be assumed to have an untrusted actor somewhere within it, access controls are a top requirement even if you aren’t exposing your work to the public.

How can I address this in Python?

This has a close tie-in to #4 (Insecure Design) because how you design your access controls is a key element in making sure they are well-suited for the situation and not prone to breakage. Much like a physical building that can use keys, keycards, biometrics, or other different types of access controls, choosing an option that fits your situation is important. From JSON Web Token (JWT) to role-based access control (RBAC), and a variety of other options, it’s important to ensure that your project contributors can understand and maintain every nuance to your access controls.

The recommendations from OWASP are immediately useful for application developers:

  • Except for public resources, deny by default.
  • Implement access control mechanisms once and re-use them throughout the application, including minimizing Cross-Origin Resource Sharing (CORS) usage.
  • Model access controls should enforce record ownership rather than accepting that the user can create, read, update, or delete any record.
  • Unique application business limit requirements should be enforced by domain models.
  • Disable web server directory listing and ensure file metadata (e.g., .git) and backup files are not present within web roots.
  • Log access control failures, alert admins when appropriate (e.g., repeated failures).
  • Rate limit API and controller access to minimize the harm from automated attack tooling.
  • Stateful session identifiers should be invalidated on the server after logout. Stateless JWT tokens should rather be short-lived so that the window of opportunity for an attacker is minimized. For longer lived JWTs it's highly recommended to follow the OAuth standards to revoke access.

I decided not to provide a code example here, because it’s important that you turn a critical eye to properly implementing whatever design is best for your situation.

#2 - Cryptographic Failures

The first thing is to determine the protection needs of data in transit and at rest. For example, passwords, credit card numbers, health records, personal information, and business secrets require extra protection, mainly if that data falls under privacy laws, e.g., EU's General Data Protection Regulation (GDPR), or regulations, e.g., financial data protection such as PCI Data Security Standard (PCI DSS).

A02:2021-Cryptographic Failures

When does this apply to a Pythonista?

If you are creating an application that handles or stores sensitive data, you should consider whether that data should be encrypted. Check out the Azure data security and encryption best practices. Also check out the OWASP page for this entry; there is a lot to consider on this topic.

At the end of the day, Encryption at Rest is our responsibility if we’re the last ones to touch the data before it’s stored – even if we think we can rely on our infrastructure to encrypt the data. Encryption in Transit is always our responsibility when we build tools that handle sensitive data.

How can I address this in Python?

Depending on your stack, you may find some helpful topical libraries, such as the Amazon DynamoDB Encryption Client, which helps encrypt the entire payload before it is sent to the database. However, it is also recommended that we encrypt our data as soon as it arrives. The last thing we want to do is forget about the state of encryption for an element that gets piped to a log, and end up in noncompliance with some regulation because of an overzealous logger.

When I was first starting out in Python, bcrypt was how I protected passwords (strictly speaking, these tools perform one-way password hashing rather than reversible encryption). That package is still acceptable, but its maintainers suggest also considering scrypt or Argon2. Argon2 is newer and has some minor advantages, but the two have general parity. The standard library currently implements scrypt within hashlib, and scrypt is also available in the cryptography package. If you’re considering dependency minimization, go with hashlib. If you’re hashing passwords everywhere and want the newer algorithm, go with Argon2 (for example, via the argon2-cffi package).

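Again, because this is such an easy thing to accomplish, my key recommendation for you is to protect sensitive data the instant you receive it. For a password, that means hashing it right away (this example assumes the argon2-cffi package and its PasswordHasher class):

from argon2 import PasswordHasher


password_hasher = PasswordHasher()
hashed_password = password_hasher.hash(password)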
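
If you would rather stay inside the standard library, the hashlib route mentioned above could look something like the sketch below. The function names and the cost parameters are my own assumptions for illustration, not a vetted configuration, so tune them for your hardware and policy.

```python
import hashlib
import hmac
import os

# Illustrative cost parameters (assumptions); tune them for your hardware.
SCRYPT_PARAMS = {"n": 2**14, "r": 8, "p": 1, "dklen": 32}


def hash_password(password, salt=None):
    """Return (salt, digest) for a password using scrypt."""
    if salt is None:
        # A fresh random salt per password keeps identical passwords from sharing a digest.
        salt = os.urandom(16)
    digest = hashlib.scrypt(password.encode("utf-8"), salt=salt, **SCRYPT_PARAMS)
    return salt, digest


def verify_password(password, salt, expected_digest):
    """Re-hash the candidate with the stored salt and compare in constant time."""
    _, digest = hash_password(password, salt)
    return hmac.compare_digest(digest, expected_digest)
```

Store the salt next to the digest, and only ever compare digests through verify_password so the comparison stays constant-time.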

#3 - Injection

Some of the more common injections are SQL, NoSQL, OS command, Object Relational Mapping (ORM), LDAP, and Expression Language (EL) or Object Graph Navigation Library (OGNL) injection. The concept is identical among all interpreters.

A03:2021-Injection

When does this apply to a Pythonista?

If you are accepting user input, this applies. There is SQL injection to rightfully consider when accepting user input, but perhaps more importantly – because this is too often overlooked – if you are reading data that was stored by something other than your codebase, this applies then too. Refer to #8 for more on that topic.

How can I address this in Python?

Input validation is the goal here, but it depends on what you expect the user-provided content to be. Generally speaking, you won’t be dealing with any completely-free form inputs… you’ll at least know what type of information you expect it to be. Whether it’s simple like a name or email address, or something more complex like HTML or Markdown, you at least know what type the data will be. If you don’t, you’ve likely got a design problem on your hands. Since you’ve designed your processes properly and know what type the data should conform to, it’s a simple matter to ensure it conforms to that type before you use it for something else.

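import markdown
import nh3


def safe_markdown(untrusted):
    # Python-Markdown 3.x no longer supports safe_mode, so sanitize the
    # rendered HTML instead; nh3 is one sanitizer option (swapped in here).
    rendered = markdown.markdown(untrusted)
    return nh3.clean(rendered)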
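
For the SQL injection case mentioned earlier, the usual defense is a parameterized query rather than string formatting. Here is a minimal sketch using the standard library sqlite3 module; the users table and the find_user helper are made up for illustration.

```python
import sqlite3


def find_user(connection, email):
    # The "?" placeholder makes the driver bind the value as data, never as SQL.
    cursor = connection.execute(
        "SELECT id, email FROM users WHERE email = ?", (email,)
    )
    return cursor.fetchall()


# Throwaway in-memory database so the sketch is self-contained.
connection = sqlite3.connect(":memory:")
connection.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT)")
connection.execute("INSERT INTO users (email) VALUES (?)", ("alice@example.com",))
```

A classic payload such as ' OR '1'='1 is treated as a literal (and nonexistent) email address here, so it matches nothing.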

#4 - Insecure Design

Insecure design is a broad category representing different weaknesses, expressed as “missing or ineffective control design.”

A04:2021-Insecure Design

When does this apply to a Pythonista?

Always. Just… Absolutely always.

How can I address this in Python?

As I highlighted in some other sections in this article, insecure design is a breeding ground for bugs and vulnerabilities. But because the concept is a bit vague, I’ll throw down some notes of things we can look for. These are either quotes or inspired by the OWASP link above.

  • Spaghetti code introduces risk.
  • Establish and use a library of secure design patterns or paved road ready to use components.
  • DRY (Don't Repeat Yourself) is sometimes a preference that gets taken too far, but it is often also a security measure. Simple code has fewer opportunities to introduce risk.
  • Integrate plausibility checks at each tier of your application (from frontend to backend)
  • …and add fuzzing wherever possible.
  • Write unit and integration tests to validate that all critical flows are resistant to the threat model. Compile use-cases and misuse-cases for each tier of your application.
  • Segregate tier layers on the system and network layers depending on the exposure and protection needs.
  • Segregate tenants robustly by design throughout all tiers (RBAC).
  • Limit resource consumption by user or service.

#5 - Security Misconfigurations

Without a concerted, repeatable application security configuration process, systems are at a higher risk.

A05:2021-Security Misconfiguration

When does this apply to a Pythonista?

This one is really similar to #4, in that it permeates anything we’re doing.

Even if you’re not working on an application or service that is intended to be used by other people – like an integration built just for you or your team – we are introducing risk to ourselves and our organization every time we write code using an open source language like Python. In many cases, the risk is easily mitigated by tools such as an internally managed package repository…But if you find a loophole to bypass that source (intentionally or unintentionally), you could easily end up with malware that targets your developer environment or the entire organization. This is just one example of how we as developers can be the cause of a mishap due to a security misconfiguration.

How can I address this in Python?

Understand your configuration options, everywhere they exist. The easy path to progress is quick and dirty – follow a simple tutorial to get your tools running, then move on to the next task. But that quick flow is a trickster, constantly betraying us as we blindly introduce risk in the name of progress.

Here is a quick and incomplete list of places you should look for configurable settings or unnecessary features that may impact your security posture:

  • Application servers
  • Application frameworks
  • Libraries
  • Databases
  • Cloud Services
  • Network activities
  • Error handling & logging

Security configuration in our environment and application is closely related to several other items on this list, such as #6: If your company has a registry with dependency scanning in place, ensure that your environment is not mistakenly pulling from the wild wild web (I’ve been there).

Here is a quick example of a pip configuration that will allow you to pull from the proper package repository.

[global]
index = http://localhost:8081/repository/pypi-all/pypi
index-url = http://localhost:8081/repository/pypi-all/simple
cert = nexus.pem


[easy_install]
index-url = http://localhost:8081/repository/pypi-proxy/simple

Reach out to someone in your org to get the right paths…or ask them to just send you their config to copy!

#6 - Vulnerable and Outdated Components

Every organization must ensure an ongoing plan for monitoring, triaging, and applying updates or configuration changes for the lifetime of the application or portfolio.

A06:2021-Vulnerable and Outdated Components

When does this apply to a Pythonista?

It instantly applies as we discuss our Python version, and further applies to every dependency. Got a big requirements.txt? Every line you add there is another risk you’re introducing to yourself and your organization.

How can I address this in Python?

First up, be paranoid. No amount of paranoia is too much, when it comes to open source dependencies. Check out the insane volume of attacks that are happening through this vector every month.

Code is like eggs and milk: they’re part of virtually every tasty recipe, but they expire. In the case of open source software, vulnerabilities are eventually found and exploits are made available to all the baddies.

But on the other hand, there is risk associated with using the literal latest version of a tool. If you’re not very intentional about accepting the latest version, research has shown that there are negative impacts to “living on the edge,” which can break builds and end up discouraging proper dependency hygiene in the long run. Instead of using >= to take the latest version, pin your dependencies with == like so:

# requirements.txt
flask==2.2.2
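
Pinning only helps if the environment actually matches the pins. As a rough sketch, you could compare a requirements file against what is installed using importlib.metadata from the standard library. The helper names parse_pins and find_drift are my own, and the parser only understands plain name==version lines.

```python
from importlib import metadata


def parse_pins(lines):
    """Collect name -> version for simple 'name==version' lines."""
    pins = {}
    for line in lines:
        line = line.split("#", 1)[0].strip()  # drop comments and whitespace
        if "==" in line:
            name, version = line.split("==", 1)
            pins[name.strip()] = version.strip()
    return pins


def find_drift(pins, installed_version=metadata.version):
    """Return name -> (pinned, installed) for every package that differs."""
    drift = {}
    for name, pinned in pins.items():
        try:
            actual = installed_version(name)
        except metadata.PackageNotFoundError:
            actual = None  # pinned but not installed at all
        if actual != pinned:
            drift[name] = (pinned, actual)
    return drift
```

Running find_drift(parse_pins(open("requirements.txt"))) in CI is one way to notice when an environment has quietly wandered away from its pins.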

While even living on the edge is preferable to “living in disarray,” it is recommended that developers at larger organizations take advantage of tools such as Sonatype’s Lifecycle to get live recommendations related to your dependencies. Most large organizations will already have something like this available… make sure you’re plugged in and using it to the fullest.

This definitely doesn’t apply to you, because you’re a responsible developer…but the metrics tell me that someone out there needs to hear the following: There are still plenty of developers using unsupported versions that the Python Software Foundation has retired (that means there are no more security updates). Python 2 is the worst culprit, because of the perceived difficulty of upgrading. Fortunately, there are docs with information on how to port code from Python 2 to Python 3.

#7 - Identification and Authentication Failures

Confirmation of the user's identity, authentication, and session management is critical to protect against authentication-related attacks.

A07:2021-Identification and Authentication Failures

When does this apply to a Pythonista?

This applies if some of your features are intended to have limited access or handle sensitive data. There are several things to watch out for. Again, this is an incomplete list, featuring a mix of my thoughts and information from the OWASP page linked above.

  • Session identifiers in the URL (like a user ID)
  • Failed session invalidation
  • Allowing weak or insecure passwords
  • Allowing unlimited password attempts (brute forcing)
  • No MFA where you could have MFA
  • Passwords that aren’t properly hashed before storage

How can I address this in Python?

Frameworks like Django or Flask will have defaults in place for session invalidation, so just refer to point #5 and make sure you aren’t messing up that configuration. To limit password attempts, I recommend you design something in your system that will track login attempts by IP or user ID.
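
To make the attempt-tracking idea concrete, here is a minimal in-memory sketch. The class name, the limits, and the injectable clock are all assumptions of mine; a real deployment would more likely keep these counters in a shared store so they survive restarts and work across processes.

```python
import time
from collections import defaultdict, deque


class LoginAttemptLimiter:
    """Block a key (IP or user ID) after too many recent failed logins."""

    def __init__(self, max_attempts=5, window_seconds=300, clock=time.monotonic):
        self.max_attempts = max_attempts
        self.window_seconds = window_seconds
        self.clock = clock
        self._failures = defaultdict(deque)

    def _prune(self, key):
        # Forget failures that are older than the sliding window.
        cutoff = self.clock() - self.window_seconds
        attempts = self._failures[key]
        while attempts and attempts[0] <= cutoff:
            attempts.popleft()
        return attempts

    def is_blocked(self, key):
        return len(self._prune(key)) >= self.max_attempts

    def record_failure(self, key):
        self._prune(key).append(self.clock())

    def reset(self, key):
        # Call this after a successful login.
        self._failures.pop(key, None)
```

Check is_blocked before verifying credentials, call record_failure on a bad password, and call reset once the user logs in successfully.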

You can easily create custom logic instead of relying on a dependency to enforce password length and complexity requirements. Here is a simplified example that you can put into your password creation logic:

import string


def validate_complexity(password):
    # Require at least 12 characters and at least one punctuation character.
    specials = set(string.punctuation)
    if len(password) < 12:
        return False
    for character in specials:
        if character in password:
            return True
    return False

To set up multi-factor authentication, you can use pyotp for your own solution, or to integrate with Google Authenticator. Here is an example from their docs to create a Time-based One Time Password (TOTP):

import pyotp
import time


totp = pyotp.TOTP('base32secret3232')
totp.now() # => '492039'


# OTP verified for current time
totp.verify('492039') # => True
time.sleep(30)
totp.verify('492039') # => False

#8 - Software and Data Integrity Failures

Software and data integrity failures relate to code and infrastructure that does not protect against integrity violations. An example of this is where an application relies upon plugins, libraries, or modules from untrusted sources, repositories, and content delivery networks (CDNs).

A08:2021-Software and Data Integrity Failures

When does this apply to a Pythonista?

Integrity failures and violations are a bit of a dense topic, covering a variety of different things within the software supply chain (SSC). The SSC starts at the moment one of our dependencies is packaged, all the way until it’s deployed by us as part of our code. That includes any steps taken to make the dependency available to us, get it to our dev environment, and build it into our final release.

Another place that this applies is in network transactions and data serialization, again ensuring the integrity of the elements as we work with them.

The last element I want to highlight in this section is data deserialization. We programmers can sometimes trust our databases too much, causing us to pull data out and just start using it immediately, even when that database is used by other applications. In those situations, we can encounter data that only went through some basic SQL sanitization before being input… but the contents may contain malicious code that is targeting a system like the one our code is running on. This is one way to get yourself on the smelly end of a buffer overflow or Remote Code Execution (RCE).

How can I address this in Python?

Dependency testing is also something that we should do in robust deliverables, but it’s pretty uncommon due to its complexity and the low likelihood of it catching anything (though when it does catch something, it’s really helpful). An example of a dependency test is a unit test that simply validates a common set of features from that dependency. Something like the code below will allow us to validate that nothing has broken in the way we expect the package to work. A failure could imply malicious activity or (more likely) a breaking change upstream.

# test_dependencies.py
import pyotp


def test_pyotp():
    totp = pyotp.TOTP('base32secret3232')
    assert isinstance(totp.now(), str)

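Setting the Secure flag on HTTP cookies is a recommended step toward ensuring integrity for HTTP/S activities, because the cookie is then only sent over encrypted connections. The example below is a small wrapper around the create_cookie helper in requests:

from requests.cookies import create_cookie


def create_secure_cookie(name, value, **kwargs):
    # Force the Secure flag so the cookie is only sent over HTTPS.
    kwargs["secure"] = True
    return create_cookie(name, value, **kwargs)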

The most risky type of behavior on this topic is for you to run a subcommand, where the risk of RCE is introduced. The first (obvious) mitigation is to just not run subcommands, and instead use specialized logic that will do data validation for us. But in the below example, we have an even greater risk because we are getting user data from a multi-purpose database before using that data in the subcommand. To mitigate risk, we are validating that the content is just a simple name before using it in the command.

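import string


def validate(value):
    # Reject any value containing punctuation, which a simple name should not need.
    invalid_characters = set(string.punctuation)
    for character in invalid_characters:
        if character in value:
            return False
    return True


# get_user_name_from_db() stands in for your own data-access code.
name = get_user_name_from_db()
name_is_safe = validate(name)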
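
The example above stops before the subcommand actually runs. As a hedged sketch of that last step, the code below checks the name against an allow-list pattern and then passes it as a separate argument with no shell involved. The pattern, the greet helper, and the toy command are assumptions for illustration only.

```python
import re
import subprocess
import sys

# Allow-list: letters and spaces only, starting with a letter, at most 64 characters.
SIMPLE_NAME = re.compile(r"[A-Za-z][A-Za-z ]{0,63}")


def is_simple_name(value):
    return SIMPLE_NAME.fullmatch(value) is not None


def greet(name):
    if not is_simple_name(name):
        raise ValueError("refusing to pass an unexpected value to a subcommand")
    # Arguments go in as a list and no shell is used, so the name is never
    # interpreted as shell syntax even if validation were to miss something.
    completed = subprocess.run(
        [sys.executable, "-c", "import sys; print('Hello, ' + sys.argv[1])", name],
        capture_output=True,
        text=True,
        check=True,
    )
    return completed.stdout.strip()
```

Validation and list-style arguments are two separate layers here; either one alone leaves more room for mistakes than both together.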

#9 - Security Logging and Monitoring Failures

This category is to help detect, escalate, and respond to active breaches. Without logging and monitoring, breaches cannot be detected.

A09:2021-Security Logging and Monitoring Failures

When does this apply to a Pythonista?

If the stuff you’re building is deployed to more users than just your own team, this category applies to you at least a little bit. This is mostly concerned with operations elements beyond the scope of our development process, but it’s important to consider how we are handling logging and monitoring. To some degree, this will overlap with #4, Insecure design, because we want to consider how our application handles logs in a way that is both useful and selective.

How can I address this in Python?

Make sure to consider how your software is being deployed, and where your logs will be sent. If you’ve got a containerized application, you’ll want to turn a careful eye to anything that could cause the container to restart– and make sure the logs in those places are useful for your SRE, DevOps, or whoever is maintaining the deployment. At the same time, you want to turn a critical eye to your log formatting if your logs are being sent along to something like Splunk or Sentry.

The last big thing is related to #2 - Cryptographic failures: Make sure you aren’t somehow logging sensitive information. If you’re taking my advice and instantly encrypting all sensitive information, you likely won’t have this issue. But to be on the safe side, ensure that you never do something like this (it’s wrong for a lot of reasons, but mostly because we’re logging the password):

# Anti-pattern: the password ends up in the logs.
try:
    my_system.login(username, password)
except Exception as e:
    logger.warning(f"Failed to login {username} with {password}. Reason: {e}")
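
A safer variant of the snippet above might look like the following sketch. The function name and the system argument (standing in for whatever object exposes login) are hypothetical; the point is simply that the credential never reaches the logger.

```python
import logging

logger = logging.getLogger("auth")


def login_with_safe_logging(system, username, password):
    """Attempt a login and log failures without exposing the password."""
    try:
        system.login(username, password)
    except Exception as error:
        # Log who failed and why, but never the credential itself.
        logger.warning("Failed to login %s. Reason: %s", username, error)
        return False
    return True
```

Passing the values as logger arguments instead of building an f-string also lets the logging framework skip formatting when the level is disabled.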

#10 - Server-Side Request Forgery

SSRF flaws occur whenever a web application is fetching a remote resource without validating the user-supplied URL. It allows an attacker to coerce the application to send a crafted request to an unexpected destination, even when protected by a firewall, VPN, or another type of network access control list (ACL).

A10:2021-Server-Side Request Forgery

When does this apply to a Pythonista?

This is a really, really specific type of issue to run into, and it requires you to be in a situation where user-supplied data is being used in a request made from your application. According to OWASP, this is not a data-driven entry to the Top-10– rather, it was added to the list based on feedback from their contributors and community members.

How can I address this in Python?

If you’re accepting user input to determine your target path, make sure to sanitize the input and use an allow-list for everything you can, such as the port. Also be sure to prevent redirects, so you don’t end up getting sent on an unexpected journey. This is really simple if you’re using the popular third-party requests library:

import requests


response = requests.get("https://google.com", allow_redirects=False)
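
The allow-list half of that advice can be sketched with the standard library alone. The allowed scheme, host, and port below are placeholders I made up, so substitute whatever your application genuinely needs to reach.

```python
from urllib.parse import urlsplit

# Hypothetical allow-lists; replace with the destinations your app really needs.
ALLOWED_SCHEMES = {"https"}
ALLOWED_HOSTS = {"api.example.com"}
ALLOWED_PORTS = {443}


def is_allowed_url(url):
    """Return True only if scheme, host, and port are all on the allow-list."""
    try:
        parts = urlsplit(url)
        # .port raises ValueError for out-of-range or malformed ports.
        port = parts.port if parts.port is not None else 443
    except ValueError:
        return False
    return (
        parts.scheme in ALLOWED_SCHEMES
        and parts.hostname in ALLOWED_HOSTS
        and port in ALLOWED_PORTS
    )
```

Only after is_allowed_url returns True would you hand the URL to requests.get with allow_redirects=False. Note that this checks the hostname as written; it does not protect against a permitted name resolving to an internal address.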

Conclusion

It's not uncommon to feel overwhelmed or dismissive of security checklists like the OWASP Top 10. However, as we've seen, these lists are both manageable and relevant. In fact, some of the issues on the list, such as improper dependency management, can have a direct impact on your personal security. By taking the time to understand and address these issues, you can better protect yourself and your organization from potential threats.

Hopefully, this walkthrough of the OWASP Top 10 has shown you that security best practices don't have to be boring or daunting. With the right mindset and approach, anyone can take a proactive role in securing their systems and applications. So, whether you're a developer, a security professional, or just someone interested in staying safe online, I encourage you to explore similar material and continue learning about this important topic!

