
15 September 2025
AMLNet: a knowledge-based multi-agent framework to generate and detect realistic money laundering transactions
Introduction: why realistic synthetic AML data matters
Financial crime researchers and practitioners face a persistent obstacle: the scarcity of realistic, shareable transaction datasets. Real bank logs are highly sensitive and often impossible to publish; when available they risk re-identification. That shortage limits reproducible research, fair model comparison and the development of systems that actually work in regulated environments. AMLNet responds to that need by offering a knowledge-driven multi-agent system that both generates large-scale, regulation-aware synthetic transaction streams and runs an ensemble detection pipeline against them. The work aims to strike a pragmatic balance: realistic behavioral and network structure, explicit regulatory alignment, and detection methods that generalize beyond the generator itself.
How AMLNet works: two coordinated units
AMLNet is organized as two coordinated components. The Transaction Generation Unit is agent-based and knowledge-driven: customer simulation agents hold demographic and financial profiles, transaction generation agents simulate day-to-day banking behavior with three-level temporal modeling (hour, day, month), and AML pattern injection agents temporarily override behavior to produce laundering typologies (structuring, layering, integration) at configurable sophistication levels. AUSTRAC rules, ABS demographic statistics and payment-system timing guide both agent priors and injection logic so generated outputs reflect Australian regulatory expectations and payment rhythms.
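The three-level temporal modeling described above can be illustrated with a minimal sketch, assuming the hour, day and month factors combine multiplicatively. The weight values and the names `HOUR_WEIGHT`, `WEEKDAY_WEIGHT`, `month_weight`, `activity_weight` and `sample_timestamps` are hypothetical and are not taken from AMLNet.

```python
import random
from datetime import datetime, timedelta

# Hypothetical weights for illustration only; not AMLNet's actual parameters.
HOUR_WEIGHT = [0.2] * 7 + [1.0] * 12 + [0.5] * 5      # 24 entries: night, daytime, evening
WEEKDAY_WEIGHT = [1.0, 1.0, 1.0, 1.0, 1.2, 0.6, 0.5]  # Monday .. Sunday


def month_weight(day_of_month: int) -> float:
    """Boost the start and end of the month to mimic salary-driven peaks."""
    return 1.5 if day_of_month <= 3 or day_of_month >= 28 else 1.0


def activity_weight(ts: datetime) -> float:
    """Combine hour, weekday and day-of-month factors multiplicatively."""
    return HOUR_WEIGHT[ts.hour] * WEEKDAY_WEIGHT[ts.weekday()] * month_weight(ts.day)


def sample_timestamps(start: datetime, days: int, n: int, seed: int = 0) -> list[datetime]:
    """Draw n hourly slots with probability proportional to activity_weight."""
    rng = random.Random(seed)
    slots = [start + timedelta(hours=h) for h in range(days * 24)]
    weights = [activity_weight(s) for s in slots]
    return sorted(rng.choices(slots, weights=weights, k=n))
```

Calling `sample_timestamps(datetime(2025, 1, 1), 30, 1000)` yields 1,000 sorted timestamps concentrated in daytime hours, on weekdays and around the month boundaries. An injection agent could draw from a different weight profile for the accounts it temporarily overrides.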
The ML Detection Unit is an ensemble pipeline designed for near-real-time scoring. Detection agents extract amount, temporal and network features for each transaction (including centrality, clustering, velocity, periodicity and structuring indicators), apply resampling (SMOTE + under‑sampling) to mitigate class imbalance, and use an Isolation Forest plus Random Forest ensemble to produce anomaly and risk scores. Alerts are tiered (low/medium/high), and a human-in-the-loop feedback loop lets domain experts refine generation rules and thresholds so the generator and detector iteratively improve.
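As a rough illustration of the tiering step, the sketch below blends an anomaly score and a risk score and maps the result to a low, medium or high alert. The blend weight, the thresholds and every name in the block are assumptions made for illustration; the article does not state how AMLNet combines or thresholds its scores.

```python
from dataclasses import dataclass


@dataclass
class ScoredTransaction:
    tx_id: str
    anomaly_score: float  # e.g. an Isolation Forest output rescaled to [0, 1]
    risk_score: float     # e.g. a Random Forest fraud probability in [0, 1]


def combined_score(tx: ScoredTransaction, anomaly_weight: float = 0.4) -> float:
    """Weighted blend of the two scores; the 0.4 / 0.6 split is an assumption."""
    return anomaly_weight * tx.anomaly_score + (1.0 - anomaly_weight) * tx.risk_score


def alert_tier(score: float, medium: float = 0.5, high: float = 0.8) -> str:
    """Map a blended score to an alert tier; both thresholds are hypothetical."""
    if score >= high:
        return "high"
    if score >= medium:
        return "medium"
    return "low"
```

In a human-in-the-loop setting, the `medium` and `high` thresholds are the kind of parameters a domain expert might adjust after reviewing alert quality.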
What the system produces: dataset and typologies
Using this architecture the authors generated 1,090,173 transactions spanning 195 days, with 1,745 labeled laundering events (≈0.16% positives). Transactions cover eight payment types common in Australian banking (BPAY, OSKO, NPP, transfers, debit/card, etc.) and include targeted high‑risk categories such as shell companies and cryptocurrency. Laundering injections model classic and advanced tactics: structuring (smurfing amounts under reporting thresholds, varying split strategies by sophistication), layering (multi-hop transfers, circular flows, variable delays), and integration (payments to merchants, real estate, crypto conversions, sometimes split and delayed). All injected suspicious flows are labeled to enable supervised evaluation.
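Structuring, as described above, can be sketched as splitting one large sum into deposits that each stay below a reporting threshold. The example assumes a threshold of AUD 10,000 (the level at which AUSTRAC threshold transaction reports apply to cash transactions); the article does not state the value AMLNet uses, and the 60% to 95% chunk range and the function name are hypothetical.

```python
import random

REPORTING_THRESHOLD_AUD = 10_000  # assumed value; not stated in the article


def split_under_threshold(total_cents: int, seed: int = 0,
                          threshold_cents: int = REPORTING_THRESHOLD_AUD * 100) -> list[int]:
    """Split a total (in cents) into random chunks, each strictly below the threshold."""
    rng = random.Random(seed)
    # Illustrative range: every full chunk is 60% to 95% of the threshold.
    lower = int(threshold_cents * 0.60)
    upper = int(threshold_cents * 0.95)
    chunks = []
    remaining = total_cents
    while remaining > 0:
        chunk = min(remaining, rng.randint(lower, upper))
        chunks.append(chunk)
        remaining -= chunk
    return chunks
```

Working in integer cents keeps the chunk sum exactly equal to the original total. A more sophisticated injection profile might also vary the delay between chunks or spread them across several accounts.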
Regulatory and technical validation
AMLNet explicitly measures regulatory alignment against AUSTRAC typology ranges. Using a simple scoring rubric, the generated suspicious transactions attained a 75% AUSTRAC alignment score: placement/structuring, unusual account activity and high‑risk categories sit largely within expected ranges, while layering and unusual transaction volumes are over-represented and integration is under-represented. That pattern is defensible: layering often dominates transaction‑monitoring signals while integration (e.g., conversion into real-world assets) may be harder to detect from transaction logs alone.
Technical realism was evaluated across three dimensions (temporal, structural and behavioral) and combined into a composite fidelity score of 0.75. Temporal fidelity (FastDTW) scored 0.59, the lowest of the three, while still capturing hourly, daily and monthly cycles and salary-driven month-start and month-end peaks. Structural similarity (graph edit distance) was very high (0.99), showing that the generator creates plausible network topology and cross-institution flows (about 49% interbank). Behavioral fidelity aggregated several measures (category distributions, risk-score alignment, alert distributions, fraud probability precision) to 0.71. These multi-dimensional metrics provide objective evidence that AMLNet's outputs are not merely random noise but reflect plausible, regulator-informed banking behavior.
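To make the temporal measure more concrete, here is a minimal sketch of dynamic time warping between two activity curves. It uses the classic quadratic-time recurrence, not the FastDTW approximation named above, and the mapping from distance to a similarity in (0, 1] is an assumption; the article does not say how AMLNet normalises its score.

```python
def dtw_distance(a: list[float], b: list[float]) -> float:
    """Classic O(len(a) * len(b)) dynamic time warping with absolute-difference cost."""
    inf = float("inf")
    n, m = len(a), len(b)
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of the three allowed predecessor alignments.
            cost[i][j] = step + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


def dtw_similarity(a: list[float], b: list[float]) -> float:
    """Map distance to (0, 1] for non-empty inputs; this normalisation is an assumption."""
    return 1.0 / (1.0 + dtw_distance(a, b) / max(len(a), len(b)))
```

Because warping lets one series stretch against the other, `[1, 2, 3]` and `[1, 2, 2, 3]` have a distance of zero. That tolerance to small timing shifts is the usual reason to prefer DTW over a point-by-point comparison of hourly or daily volumes.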
Detection performance and cross-dataset generalizability
Trained and evaluated on AMLNet’s internal partitions, the ensemble detector achieved a strong operational balance: precision 0.84, recall 0.97 and F1 0.90. Average processing latency was reported at 0.0002 seconds per transaction (sub‑millisecond), demonstrating the approach can scale to high-throughput monitoring.
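The reported figures can be sanity-checked with a few lines of arithmetic. The sketch below recomputes F1 from the stated precision and recall and converts the stated latency into an approximate throughput, assuming a single scorer that processes transactions sequentially; the function names are illustrative.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


def throughput_per_second(latency_seconds: float) -> float:
    """Transactions per second for one sequential scorer (an assumption)."""
    return 1.0 / latency_seconds


f1 = f1_score(0.84, 0.97)           # about 0.90, matching the reported F1
tps = throughput_per_second(0.0002)  # about 5,000 transactions per second
```

A precision of 0.84 also means that roughly 16 of every 100 alerts would be false positives, which is worth keeping in mind when estimating analyst workload.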
To test architectural generalizability, the detection pipeline was evaluated on an external synthetic dataset (SynthAML). Without retraining on AMLNet data, the same detection pipeline achieved ROC AUC 0.80 and F1 0.69 on SynthAML, showing that the feature engineering and ensemble approach transfer across independent synthetic generation paradigms and different regulatory foundations (AUSTRAC vs Danish assumptions). Ablation studies further show the value of each component: removing network analysis, temporal patterning, multi-bank support or risk scoring materially degraded precision and F1, with the largest drop when agent behavioral complexity was simplified.
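For readers less familiar with the metric, ROC AUC can be read as the probability that a randomly chosen laundering transaction receives a higher score than a randomly chosen legitimate one. The pairwise sketch below makes that definition explicit; it is illustrative only, quadratic in the number of samples, and not how the authors computed their figure.

```python
def roc_auc(labels: list[int], scores: list[float]) -> float:
    """Probability a random positive outscores a random negative (ties count half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    # Count positive/negative pairs ranked correctly, with half credit for ties.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```

On this reading, an AUC of 0.80 means a positive outranks a negative in about four out of five random pairs, while 0.5 corresponds to chance.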
Where AMLNet advances the field
AMLNet combines several practical advances valuable for financial-crime research and AML engineering.
- First, it embeds regulatory knowledge directly into the generator and reports a quantified alignment metric, which helps bridge the gap between academic datasets and operational compliance needs.
- Second, it produces a labeled, large-scale dataset that preserves temporal cycles and network complexity, enabling richer evaluations of graph-aware, temporal and ensemble detectors.
- Third, it demonstrates detection portability across synthetic datasets, an important sign that detection architectures can generalize beyond the quirks of a single generator.
- Finally, it integrates human‑in‑the‑loop refinement so detection weaknesses can inform generator improvements while keeping regulatory oversight central.
Limitations and ethical considerations
AMLNet is intentionally synthetic, and the authors caution against using it as a substitute for operational compliance without institution-specific validation. The 0.16% positive rate is tuned to be useful for research and may differ from, and in some cases exceed, the true prevalence at a given bank. Some typologies and jurisdictional nuances are simplified; notably, integration events are under-represented by design. The dataset and code are released for non-commercial research (CC BY-NC 4.0), and the authors explicitly discourage misuse such as using the resource to design evasion strategies without proper oversight. Evaluation on real bank data remains an open challenge: synthetic realism helps, but field validation is still required for production readiness.
Practical implications for practitioners and researchers
For AML teams, regulators and vendors, AMLNet offers a ready environment to:
- benchmark algorithms against labeled, regulator‑aware scenarios that include complex layering and multi‑bank flows;
- test real-time pipelines at sub-millisecond per-transaction latency across realistic hourly, daily and monthly patterns;
- validate model robustness to distributional shifts by using cross-dataset evaluation (training on one synthetic generator and testing on another);
- use the human-in-the-loop feedback to iteratively tune detection thresholds and generation parameters so detection capability and data realism co-evolve.
For researchers, AMLNet’s dataset and the provided evaluation framework (temporal DTW, graph edit distance, behavioral metrics) present a more standardized way to assess realism and detector performance across multiple axes, helping to close a reproducibility gap in AML research.
Conclusion and next steps
AMLNet presents a practical, knowledge-based multi-agent approach that couples a regulation-aware transaction generator with an ensemble detection pipeline. It produces large-scale, labeled synthetic data with quantified regulatory alignment and a multi-dimensional fidelity score, and it demonstrates strong detection performance that generalizes across synthetic datasets. Future work identified by the authors includes addressing concentrated signal challenges, expanding cross-jurisdictional parameterization (e.g., FinCEN, AMLD), and exploring federated learning to further enhance realism while preserving privacy. The dataset (Version 1.0) and associated code are available for research use and provide a useful foundation for advancing reproducible, regulation-conscious AML experimentation.
Dive deeper
- Research: Huda, S., Foo, E., Jadidi, Z., Newton, M. A. H., & Sattar, A. (2025). AMLNet: A Knowledge-Based Multi-Agent Framework to Generate and Detect Realistic Money Laundering Transactions. Preprint (Version 1, 15 Sep 2025). Under review in Expert Systems with Applications.
- Dataset: https://doi.org/10.5281/zenodo.16736515
- Licensed under the following terms, with no changes made: CC BY-NC 4.0