Info
ID: AT-RE002.002
Technique: Application Dependencies Mapping
Tactic: Reconnaissance
Platforms: PRE
Version: 1.0
Package Manifest Scraping
Package Manifest Scraping is a reconnaissance technique where attackers analyze application dependency files (like package.json, requirements.txt, pom.xml, Gemfile, or composer.json) to identify software components, their versions, and potential vulnerabilities. During the Application Dependencies Mapping phase of reconnaissance, attackers extract these manifests from accessible repositories, websites, or exposed configuration files to build a comprehensive understanding of the target application's technology stack. By examining these files, adversaries can pinpoint outdated libraries with known security vulnerabilities (CVEs), determine framework versions that may contain exploitable flaws, and identify dependencies that could be targeted for supply chain attacks. This intelligence gathering technique requires minimal interaction with the target system and often leverages publicly available information, making it difficult to detect while providing attackers with valuable insights for planning subsequent phases of their attack campaign.
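As an illustration of how much a manifest discloses, and of what a defender can check in their own public files, the following is a minimal sketch of the extraction step. It assumes the manifest text has already been obtained, handles only package.json and simple `name==version` pins in requirements.txt, and uses hypothetical sample data and function names (`parse_package_json`, `parse_requirements`) that do not come from any particular tool.

```python
import json
import re


def parse_package_json(text):
    """Return {name: version_spec} from the text of a package.json file."""
    data = json.loads(text)
    deps = {}
    for section in ("dependencies", "devDependencies"):
        # A missing or null section contributes nothing.
        deps.update(data.get(section) or {})
    return deps


# Matches only exact pins such as "name==1.2.3"; ranges, extras, and URLs
# are deliberately ignored in this simplified sketch.
_PIN = re.compile(r"^\s*([A-Za-z0-9][A-Za-z0-9._-]*)\s*==\s*([^\s;#]+)")


def parse_requirements(text):
    """Return {name: version} for exactly pinned lines of a requirements.txt."""
    deps = {}
    for line in text.splitlines():
        match = _PIN.match(line)
        if match:
            deps[match.group(1).lower()] = match.group(2)
    return deps


# Hypothetical sample manifests used only for demonstration.
SAMPLE_PACKAGE_JSON = (
    '{"name": "demo-app", '
    '"dependencies": {"express": "4.17.1", "lodash": "^4.17.15"}}'
)
SAMPLE_REQUIREMENTS = (
    "# web stack\n"
    "Django==3.2.1\n"
    "requests>=2.0\n"
    "PyYAML==5.3.1  # config\n"
)

if __name__ == "__main__":
    print(parse_package_json(SAMPLE_PACKAGE_JSON))
    print(parse_requirements(SAMPLE_REQUIREMENTS))
```

Exact pins are the most informative entries, since they map directly to a single release that can be compared against published advisories. Range specifiers such as `^4.17.15` or `>=2.0` narrow the candidates without identifying one version.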
Data Sources
- Public Repositories: Dependency manifests in GitHub, GitLab, and other version control systems
- Package Registries: Package metadata from npm, PyPI, Maven Central, and other repositories
- API Logs: Access logs from repository and package registry APIs
- Network Traffic: HTTP requests to package management endpoints
Mitigations
ID | Mitigation | Description |
---|---|---|
M1013 | Application Developer Guidance | Limit sensitive information in publicly accessible dependency manifests |
M1021 | Restrict Web-Based Content | Implement access controls on repositories containing sensitive dependency information |
M1017 | User Training | Train developers on risks of exposing detailed dependency information publicly |
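
One way to act on the first mitigation is a pre-publication check that looks for internal package sources in a manifest. The sketch below is an assumption-laden example for requirements.txt only: it inspects pip's `--index-url`, `--extra-index-url`, and `--find-links` options (and their `-i` and `-f` short forms), treats the hosts in `PUBLIC_HOSTS` as the only acceptable public sources, and uses a hypothetical function name and sample file.

```python
import re
from urllib.parse import urlparse

# Assumption: these are the only hosts considered safe to publish.
PUBLIC_HOSTS = {"pypi.org", "files.pythonhosted.org"}

# pip options that point at a package source, followed by their value.
_SOURCE_OPTION = re.compile(
    r"^\s*(?:--index-url|--extra-index-url|--find-links|-i|-f)[\s=]+(\S+)"
)


def find_private_sources(requirements_text):
    """Return (line_number, host, has_credentials) for risky source options."""
    findings = []
    for lineno, line in enumerate(requirements_text.splitlines(), start=1):
        match = _SOURCE_OPTION.match(line)
        if not match:
            continue
        parsed = urlparse(match.group(1))
        host = parsed.hostname
        if host is None:
            # Local paths such as "./wheels" carry no host to disclose.
            continue
        has_credentials = parsed.username is not None
        if host not in PUBLIC_HOSTS or has_credentials:
            findings.append((lineno, host, has_credentials))
    return findings


# Hypothetical manifest; the host and credentials are invented placeholders.
SAMPLE = (
    "--index-url https://pypi.org/simple\n"
    "--extra-index-url https://deploy:s3cret@pkgs.internal.example.com/simple\n"
    "-f ./wheels\n"
    "requests==2.31.0\n"
)

if __name__ == "__main__":
    for lineno, host, has_credentials in find_private_sources(SAMPLE):
        print(f"line {lineno}: {host} (credentials embedded: {has_credentials})")
```

A check like this could run in a pre-commit hook or CI job before a repository is made public. Equivalent checks for other ecosystems would need their own parsers.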
Detection
Package-manifest scraping usually occurs on public SCM platforms or registry APIs an organisation does not control. Real-time enterprise detection is therefore limited.
Potential visibility avenues include:
- Hosted-platform audit logs (GitHub, GitLab, Bitbucket) that show an unusually high volume of clone operations or raw-file downloads, available only when the repository is under the organisation’s control.
- External threat intelligence that flags large-scale scraping of project manifests tied to the organisation’s brand.
- Post-reconnaissance indicators such as phishing or exploit payloads referencing library versions harvested from the manifests.
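
The audit-log avenue above can be sketched as a simple per-actor count. This is a minimal example under stated assumptions: the event records, their `actor` and `action` fields, the action labels in `WATCHED_ACTIONS`, and the threshold are all hypothetical and would need to be mapped to whatever the hosting platform's audit export actually provides.

```python
from collections import Counter

# Hypothetical action labels; real audit exports use platform-specific names.
WATCHED_ACTIONS = {"clone", "raw_download"}


def flag_heavy_readers(events, threshold):
    """Return {actor: count} for actors with at least `threshold` watched events."""
    counts = Counter(
        event["actor"]
        for event in events
        if event.get("action") in WATCHED_ACTIONS
    )
    return {actor: n for actor, n in counts.items() if n >= threshold}


# Invented sample events for demonstration only.
SAMPLE_EVENTS = (
    [{"actor": "ci-bot", "action": "clone"}] * 2
    + [{"actor": "unknown-203", "action": "raw_download"}] * 5
    + [{"actor": "alice", "action": "push"}]
)

if __name__ == "__main__":
    print(flag_heavy_readers(SAMPLE_EVENTS, threshold=4))
```

A fixed threshold is the simplest possible rule. In practice a baseline per actor or per time window is likely to produce fewer false positives, for example from CI systems that clone frequently.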