Extended Description

Input validation is a frequently used technique for checking potentially dangerous inputs to ensure that they are safe to process within the code or to pass to other components.

Input can consist of:

- raw data - strings, numbers, parameters, file contents, etc.

- metadata - information about the raw data, such as headers or size

Data can be simple or structured. Structured data can contain many nested layers, each combining metadata and raw data with other simple or structured data.

Many properties of raw data or metadata may need to be validated upon entry into the code, such as:

- specified quantities such as size, length, frequency, price, rate, number of operations, time, etc.

- implied or derived quantities, such as the actual size of a file instead of a specified size

- indexes, offsets, or positions into more complex data structures

- symbolic keys or other elements used to look up entries in hash tables, associative arrays, etc.

- well-formedness, i.e. syntactic correctness - compliance with expected syntax

- lexical token correctness - compliance with rules for what is treated as a token

- specified or derived type - the actual type of the input (or what the input appears to be)

- consistency - between individual data elements, between raw data and metadata, between references, etc.

- conformance to domain-specific rules, e.g. business logic

- equivalence - ensuring that equivalent inputs are treated the same

- authenticity, ownership, or other attestations about the input, e.g. a cryptographic signature to prove the source of the data

Implied or derived properties of data must often be calculated or inferred by the code itself. Errors in deriving properties may be considered a contributing factor to improper input validation.
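Several of the properties listed above can be checked together at the point where data enters the code. The following Python is a minimal sketch rather than a definitive implementation: the `Record` type, the `parse_record` and `to_non_negative_int` helpers, the field names, and the `MAX_QUANTITY` and `MAX_DIGITS` limits are all hypothetical and chosen only for illustration. It converts text fields to the expected type, compares a specified length against the derived size of the payload, and checks a quantity and an index against their allowed ranges.

```python
from dataclasses import dataclass

MAX_QUANTITY = 1000  # hypothetical business limit, for illustration only
MAX_DIGITS = 10      # hypothetical cap on the length of numeric fields


class ValidationError(ValueError):
    """Raised when an input fails a validation check."""


@dataclass(frozen=True)
class Record:
    payload: bytes
    quantity: int
    index: int


def to_non_negative_int(text: str) -> int:
    # int() alone also accepts forms such as " 7 " or "1_000",
    # so only plain ASCII digit strings of bounded length are accepted.
    if not (text.isascii() and text.isdigit() and len(text) <= MAX_DIGITS):
        raise ValidationError("field is not a plain non-negative integer")
    return int(text)


def parse_record(declared_length: str, payload: bytes,
                 quantity: str, index: str) -> Record:
    # Specified type: convert each text field to the expected type first.
    length = to_non_negative_int(declared_length)
    qty = to_non_negative_int(quantity)
    idx = to_non_negative_int(index)

    # Consistency between metadata and raw data: the specified length
    # must match the size derived from the payload itself.
    if length != len(payload):
        raise ValidationError("declared length does not match payload size")

    # Specified quantity: must fall within the allowed range.
    if not 1 <= qty <= MAX_QUANTITY:
        raise ValidationError("quantity out of range")

    # Index: must point inside the payload.
    if idx >= len(payload):
        raise ValidationError("index outside payload")

    return Record(payload, qty, idx)
```

A call such as `parse_record("5", b"hello", "3", "4")` returns a `Record`, while a declared length of `"6"` for the same payload raises `ValidationError`. The derived size is computed by the code itself and never taken from the input.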

Technical Details

Common Consequences

- Availability

- Confidentiality

- Integrity

Impacts

- DoS: crash, exit, or restart

- DoS: resource consumption (CPU)

- DoS: resource consumption (memory)

- Read memory

- Read files or directories

- Modify memory

- Execute unauthorized code or commands

Detection Methods

- Automated static analysis

- Manual static analysis

- Fuzzing

- Automated static analysis - binary or bytecode

- Manual static analysis - binary or bytecode

- Dynamic analysis with automated results interpretation

- Dynamic analysis with manual results interpretation

- Manual static analysis - source code

- Automated static analysis - source code

- Architecture or design review

Potential Mitigations

Phases:

- Architecture and design

- Implementation

Descriptions:

• Consider using language-theoretic security (LangSec) techniques that characterize inputs using a formal language and build "recognizers" for that language. This requires parsing to be a distinct layer that enforces a boundary between raw input and internal data representations, instead of allowing parser code to be scattered throughout the program, where it could be subject to errors or inconsistencies that create weaknesses. [REF-1109] [REF-1110] [REF-1111]

• Use an input validation framework such as Struts or the OWASP ESAPI Validation API. Note that using a framework does not automatically address all input validation problems; be mindful of weaknesses that could arise from misusing the framework itself (CWE-1173).

• Understand all the potential areas where untrusted inputs can enter the product, including but not limited to: parameters or arguments, cookies, anything read from the network, environment variables, reverse DNS lookups, query results, request headers, URL components, e-mail, files, filenames, databases, and any external systems that provide data to the application. Remember that such inputs may be obtained indirectly through API calls.

• Assume all input is malicious. Use an "accept known good" input validation strategy, i.e., use a list of acceptable inputs that strictly conform to specifications. Reject any input that does not strictly conform to specifications, or transform it into something that does. When performing input validation, consider all potentially relevant properties, including length, type of input, the full range of acceptable values, missing or extra inputs, syntax, consistency across related fields, and conformance to business rules. As an example of business rule logic, "boat" may be syntactically valid because it only contains alphanumeric characters, but it is not valid if the input is only expected to contain colors such as "red" or "blue." Do not rely exclusively on looking for malicious or malformed inputs. This is likely to miss at least one undesirable input, especially if the code's environment changes. This can give attackers enough room to bypass the intended validation. However, denylists can be useful for detecting potential attacks or determining which inputs are so malformed that they should be rejected outright.

• For any security checks that are performed on the client side, ensure that these checks are duplicated on the server side, in order to avoid CWE-602. Attackers can bypass the client-side checks by modifying values after the checks have been performed, or by changing the client to remove the client-side checks entirely. Then, these modified values would be submitted to the server. Even though client-side checks provide minimal benefits with respect to server-side security, they are still useful. First, they can support intrusion detection. If the server receives input that should have been rejected by the client, then it may be an indication of an attack. Second, client-side error-checking can provide helpful feedback to the user about the expectations for valid input. Third, there may be a reduction in server-side processing time for accidental input errors, although this is typically a small savings.

• Be especially careful to validate all input when invoking code that crosses language boundaries, such as from an interpreted language to native code. This could create an unexpected interaction between the language boundaries. Ensure that you are not violating any of the expectations of the language with which you are interfacing. For example, even though Java may not be susceptible to buffer overflows, providing a large argument in a call to native code might trigger an overflow.

• When your application combines data from multiple sources, perform the validation after the sources have been combined. The individual data elements may pass the validation step but violate the intended restrictions after they have been combined.

• Directly convert your input type into the expected data type, such as using a conversion function that translates a string into a number. After converting to the expected data type, ensure that the input's values fall within the expected range of allowable values and that multi-field consistencies are maintained.

• When exchanging data between components, ensure that both components are using the same character encoding. Ensure that the proper encoding is applied at each interface. Explicitly set the encoding you are using whenever the protocol allows you to do so.

• Inputs should be decoded and canonicalized to the application's current internal representation before being validated (CWE-180, CWE-181). Make sure that your application does not inadvertently decode the same input twice (CWE-174). Such errors could be used to bypass allowlist schemes by introducing dangerous inputs after they have been checked. Use libraries such as the OWASP ESAPI Canonicalization control. Consider performing repeated canonicalization until your input does not change any more. This will avoid double-decoding and similar scenarios, but it might inadvertently modify inputs that are allowed to contain properly-encoded dangerous content.
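The last two strategies above, canonicalizing before validating and accepting only known-good values, can be sketched in Python as follows. This is an illustration under stated assumptions, not the OWASP ESAPI control mentioned above: it assumes that percent-encoding and Unicode compatibility forms are the only encodings of concern, and the names `canonicalize`, `validate_color`, `ALLOWED_COLORS`, and `MAX_DECODE_ROUNDS` are hypothetical.

```python
import unicodedata
from urllib.parse import unquote

ALLOWED_COLORS = frozenset({"red", "blue", "green"})  # hypothetical allowlist
MAX_DECODE_ROUNDS = 5  # hypothetical bound on repeated decoding


def canonicalize(raw: str) -> str:
    """Decode repeatedly until the value stops changing, then normalize."""
    current = raw
    for _ in range(MAX_DECODE_ROUNDS):
        decoded = unquote(current)
        if decoded == current:
            break  # stable: no further percent-encoded layers remain
        current = decoded
    else:
        # The value was still changing on the last round; reject it
        # instead of validating a form that may decode further.
        raise ValueError("input still changing after repeated decoding")
    # Fold compatibility characters (e.g. full-width letters), then
    # trim whitespace and lowercase for comparison.
    return unicodedata.normalize("NFKC", current).strip().lower()


def validate_color(raw: str) -> str:
    """Accept only known-good values, checked after canonicalization."""
    value = canonicalize(raw)
    if value not in ALLOWED_COLORS:
        raise ValueError("color not in allowlist")
    return value
```

Here `validate_color("%2572ed")` is decoded twice and accepted as `"red"`, while `validate_color("boat")` is rejected even though it is syntactically valid. As noted above, repeated decoding can alter inputs that are legitimately allowed to contain encoded content, so this approach fits fields with a small, fixed set of acceptable values.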

CWE-20
