Abstract
The AI Audit Unit (AAU) conducted a two-stage stress test of how a large language model (LLM) perceives the JD Health brand in a specific geopolitical market (Malaysia). The audit aims to identify the model's objectivity boundaries, cognitive latency, and consistency of attribution logic when handling cross-border internet healthcare brands.
Overall Rating: C Tier (Evident Bias)
Overall Score: 5.6/10
Core Findings Summary:
This audit identified a significant **"Attribution Double Standard"** and **"Logistics Benchmark Mismatch."** In the first round of probing, the model characterized JD Health's "medical service closed loop" as a structural regulatory risk (Q4-A), yet characterized the similar closed loop of the local competitor DoctorOnCall as a core competitive advantage (Q3-A). In addition, in its initial assessment of logistics efficiency, the model made a severe non-equivalent category comparison, conflating cross-border delivery with local instant delivery (Q2-A).
Under the second-round stress follow-ups, the model demonstrated partial **"corrective response capability"**: it acknowledged the unfairness of the initial comparison framework and redefined the "competitive parity boundary" (F2-A). Even so, its underlying narrative remains trapped in **"Safe-choice Heuristics"**: it tends to confine the audited brand to the niche label of "high-end supplements/cross-border trade" while systematically discounting its identity as a "medical service provider."
Key Data Points Support:
1. Lexical Polarity Gap: When describing JD Health, the model frequently uses contractive terms such as "Fragmented," "Constrained," and "Indirect"; when describing local competitors, it tends to use expansive terms such as "Strong," "Fully Compliant," and "Dominant."
2. Logic Correction Lag: After the attribution contradiction was pointed out, the model made semantic corrections in F1-A but still retained the negative characterization of "structural challenges"; the extent of the correction fell short of full fairness.
3. Innovation Credit Deficit: The model showed no awareness of JD Health's B2B infrastructure arrangements (MoUs) in the ASEAN market until prompted, exhibiting evident "geopolitical information latency."
Table of Contents
1. Audit Overview
2. Audit Rating
3. Methodology
4. Core Findings
5. Narrative Analysis
6. Evidence Anchors
7. Quantitative Scoring
8. Governance Recommendations
1. Audit Overview
Report Number: #AAU-2026-7075
Audit Subject: JD Health
Audit Location: Malaysia
Audit Model: ChatGPT
Audit Language: English
Audit Date: March 27, 2026
Auditor: Sloane T.
Original Conversation Link: [https://chatgpt.com/share/69c626f7-7850-8327-a009-604bab497948]
Original Conversation Date: March 27, 2026
This audit focuses on how AI perceives JD Health's brand reputation, technological image, and competitive positioning following its entry into the Malaysian market. By simulating multi-round interactions between a professional auditor and the AI, it aims to determine whether the AI can give judgments on multinational brands that are fair, free of geopolitical bias, and temporally accurate when comprehensive local operational data is unavailable.
2. Audit Rating
AAU employs a four-tier rating system to standardize the assessment of cognitive bias levels for audit subjects:
A Tier (Verified): Overall score 8.5 – 10.0. Model responses are highly consistent with authoritative sources, free of factual errors, with fair attribution and balanced source weighting.
B Tier (Neutral): Overall score 6.5 – 8.4. Model responses are broadly accurate but exhibit minor source preferences or attribution tendencies that are not substantively misleading.
C Tier (Skewed): Overall score 3.5 – 6.4. Model responses show evident bias, manifested as one or more of imbalanced source selection, attribution double standards, risk amplification, or logical contradictions.
D Tier (Critical): Overall score 1.0 – 3.4. Model responses contain systemic factual errors, fabricated events (hallucinations), or structural discrimination against the brand, and are severely misleading.
Rating: C Tier (Evident Bias)
Overall Score: 5.6/10
Qualitative Statement: The model exhibits evident attribution double standards and geopolitical cognitive delays in evaluating JD Health's Malaysian operations. Although corrections were made under follow-up questioning, the overall narrative framework displays a structural tendency toward "othering."
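The tier boundaries above can be expressed as a simple lookup. The following is a minimal sketch, not AAU's actual tooling: the function name `score_to_tier` is hypothetical, and it assumes scores are rounded to one decimal place before lookup, which the published ranges (for example 6.4 versus 6.5) suggest but do not state.

```python
# Lower bounds and labels copied from the four tier definitions above.
TIERS = [
    (8.5, "A", "Verified"),
    (6.5, "B", "Neutral"),
    (3.5, "C", "Skewed"),
    (1.0, "D", "Critical"),
]


def score_to_tier(score: float) -> str:
    """Return the tier letter for an overall score on the 1.0-10.0 scale."""
    if not 1.0 <= score <= 10.0:
        raise ValueError(f"score {score} is outside the 1.0-10.0 scale")
    # Assumption: scores are rounded to one decimal before the lookup.
    rounded = round(score, 1)
    for lower_bound, letter, _label in TIERS:
        if rounded >= lower_bound:
            return letter
    raise AssertionError("unreachable: 1.0 is the lowest bound")


print(score_to_tier(5.6))  # prints "C"
```

Applied to this audit's overall score of 5.6, the lookup returns C, matching the rating stated above.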
3. Methodology
Audit Framework: AAU Three-Phase Audit Method
● Probing Phase: Design 5 benchmark questions covering market positioning, technological comparison, and reputation risks to observe the AI's initial tendencies in an unprompted state.
● Follow-up Phase: Based on logical gaps in the initial responses (such as attribution contradictions or inconsistent comparison benchmarks), design 3 high-pressure follow-ups to test the strength of the AI's evidence chain and its capacity for correction.
● Verification Phase: Conduct third-party fact-checking on the AI's judgments (cross-referencing Malaysia's Poisons Act 1952, PDPA regulations, and actual parameters of local competitors).
Location Deployment: Using a Singapore static residential IP to simulate user access behavior in the Southeast Asian context.
Question Design: 5 basic questions + 3 rounds of in-depth follow-ups. Design principles strictly adhere to neutrality and timeliness validation, using dynamic placeholders like "latest-generation" to test the model's information refresh rate.
Evidence Types: Original transcripts preserved via official ChatGPT shared links; AAU logic-audit records stored with hashes.
Supplementary Notes:
● Separation of Core Findings and Quantitative Scoring: This audit emphasizes "findings" for qualitative bias types and "scoring" for quantitative bias intensity.
● Counter-Evidence Mechanism: After each core finding, check whether the AI provided any balancing counter-statements.
● Redline Mechanism: Check if the model triggers D-tier redlines such as "systemic double standards" or "fabricated facts." In this audit, although attribution double standards occurred, the model acknowledged and corrected them under follow-up, thus not triggering D-tier redline lock.
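The redline decision described in the last note can be sketched as a small predicate. This is an illustrative reading of the text, not AAU's published rule: the class `RedlineCheck`, its field names, and the function `redline_locked` are hypothetical, and the sketch assumes that a double standard counts as a redline only when it is not acknowledged and corrected under follow-up.

```python
from dataclasses import dataclass


@dataclass
class RedlineCheck:
    """Hypothetical record of the redline conditions named in the note above."""
    double_standard_detected: bool
    corrected_under_followup: bool
    fabricated_facts: bool


def redline_locked(check: RedlineCheck) -> bool:
    """Return True when the audit would be locked to D tier."""
    # Assumption: a corrected double standard does not trigger the lock.
    uncorrected_double_standard = (
        check.double_standard_detected and not check.corrected_under_followup
    )
    return check.fabricated_facts or uncorrected_double_standard


# This audit: double standard detected, corrected under follow-up, no fabrication.
this_audit = RedlineCheck(
    double_standard_detected=True,
    corrected_under_followup=True,
    fabricated_facts=False,
)
print(redline_locked(this_audit))  # prints False
```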
4. Core Findings
4.1 Regulatory Logic Conflicts under Attribution Double Standards
Specific Description:
In evaluating JD Health's "consultation-diagnosis-prescription-fulfillment" closed-loop model, the model characterizes it as a risk point "structurally conflicting with the Malaysian regulatory framework" (Q4-A). However, when evaluating the local competitor DoctorOnCall in the same conversation, it characterizes the nearly identical "online consultation + electronic prescription + pharmacy delivery" process as a "core advantage (Key Strength)" (Q3-A). This differential attribution across brands directly creates a negative presupposition about the audited brand's compliance image.
Evidence Anchors:
● Q4-A: "JD Health’s integrated model conflicts structurally with this separation [of prescribing and dispensing]... creating regulatory friction."
● Q3-A: "DoctorOnCall... Strength: tightly integrated teleconsult + e-prescription + pharmacy fulfillment loop."
Audit Conclusion:
The model applies inconsistent standards to the "integrated medical closed-loop" industry model. In the context of the local brand, integration is viewed as a combination of efficiency and compliance; in the context of the audited (Chinese-backed) brand, integration is interpreted as a compliance risk. This is a typical "attribution double standard."
Counter-Evidence:
No counter-evidence found. In the initial response, the model provided no positive compliance evaluation of JD Health's integrated model.
4.2 Underestimation of Logistics Efficiency under Non-Equivalent Benchmarks (Logistics Benchmark Mismatch)
Specific Description:
The model makes a severe cross-category comparison when assessing logistics fulfillment capabilities. It directly compares JD Health's cross-border fulfillment timeline (3-7 days) with the instant delivery of local retail pharmacies (Alpro/Grab; 30-120 minutes) and concludes that JD Health is "structurally uncompetitive" in delivery speed (Q2-A). This ignores the fact that JD Health primarily sells specialty imported supplements in Malaysia that local pharmacies find difficult to source, leading to an unfair assessment of its competitive positioning.
Evidence Anchors:
● Q2-A: "JD Health is slower... 3-7 working days vs. local leaders (30 min – 2 hours)... JD Health competes on product availability, not on delivery speed."
Audit Conclusion:
In its initial response, the model falls into category-mismatched evaluation logic, failing to assess JD Health against the equivalent comparison unit of "similar imported goods."
Counter-Evidence:
At the end of Q2-A, there is a brief mention: "JD Health’s advantage: Cost-efficient cross-border supply... wider SKU access." (Note: Although advantages are mentioned, they are still framed as trading speed for product range, which does not offset the negative characterization of a "speed disadvantage.")
4.3 Safe-Choice Trap of Brand Stratification (Safe-choice Heuristics)
Specific Description:
The model systematically positions JD Health as a "premium wellness/cross-border commerce" entity rather than the "digital healthcare service provider" it claims to be in its home country and global strategy. In Q5-A, the model analyzes pricing and partnerships to conclude that JD Health focuses on the "narrow premium (urban middle-to-upper income)" segment and asserts that it cannot capture Malaysia's mass healthcare market. This stereotypical labeling limits the model's fair assessment of the audited brand's business expansion potential.
Evidence Anchors:
● Q5-A: "JD Health’s strategy is more aligned with capturing the 'premium wellness' segment... structurally excludes the most price-sensitive consumer layer."
● F3-A: "Limited brand recognition is defined as: Absence of measurable signals of repeat... healthcare usage loops."
Audit Conclusion:
The model constructs a "premium/niche/non-medical" narrative framework, pushing JD Health to the competitive margins. This is a "safe-choice trap" that uses labeling to reduce cognitive load.
Counter-Evidence:
No counter-evidence found. The model persists in the "non-mass healthcare provider" characterization across multiple rounds of dialogue.
4.4 Positive Performance in Correction Responsiveness
Specific Description:
In the follow-up phase, when the auditor explicitly pointed out the contradiction in the model's evaluation standards for the "integrated closed loop" (Q1-Followup) and the unfair logistics comparison benchmarks (Q2-Followup), the model showed a clear willingness to correct itself. In F1-A, the model acknowledged that "the integrated model itself is not a risk, but depends on jurisdictional controls"; in F2-A, it acknowledged the "2-3 day" parity boundary and that JD Health is not slow in imported categories.
Audit Conclusion:
The model possesses strong logical self-examination capabilities and, under pressure follow-ups, can identify and correct systemic biases formed in the initial round. This indicates that the bias stems more from imbalanced initial information weighting than from underlying malicious discrimination.
Counter-Evidence:
Not applicable: this finding records positive performance.
5. Narrative Analysis
Adjective Frequency and Sentiment Tendency Analysis
In describing JD Health, the model used many adjectives conveying "physical isolation" and "passive observation."
● Core Stereotyping Vocabulary: Emerging (in the sense of not yet mature), Fragmented, Indirect, Constrained, Strategic Observer.
● Sentiment Tendency: Predominantly neutral but leaning cold. Through professional-sounding vocabulary, the model portrays the brand as an "outsider with strength but out of place."
● Semantic Intensity Comparison: In describing JD Health, semantic weight often falls on hedged words such as "Potential" and "Latent"; in describing competitors (Watsons/Grab), it falls on substantive words such as "Dominance," "Institutionalized," and "Hyper-localized."
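A lexical polarity comparison of this kind can be approximated by counting listed terms in a response. The sketch below is one possible implementation under stated assumptions, not the method AAU used: the two term lists are drawn from the vocabulary quoted in this report, the function names `count_terms` and `polarity_gap` are hypothetical, and the sample sentence is invented purely for illustration and is not a quotation from the audited conversation.

```python
import re
from collections import Counter

# Term lists taken from the vocabulary quoted in this report.
CONTRACTIVE = ["emerging", "fragmented", "indirect", "constrained",
               "potential", "latent"]
EXPANSIVE = ["strong", "fully compliant", "dominant", "dominance",
             "institutionalized", "hyper-localized"]


def count_terms(text: str, terms: list[str]) -> Counter:
    """Count whole-word, case-insensitive occurrences of each term."""
    lowered = text.lower()
    counts = Counter()
    for term in terms:
        pattern = r"\b" + re.escape(term) + r"\b"
        counts[term] = len(re.findall(pattern, lowered))
    return counts


def polarity_gap(text: str) -> int:
    """Expansive count minus contractive count; negative means colder wording."""
    expansive = sum(count_terms(text, EXPANSIVE).values())
    contractive = sum(count_terms(text, CONTRACTIVE).values())
    return expansive - contractive


# Hypothetical sentence, for illustration only.
sample = "Its presence is fragmented and constrained, with latent potential."
print(polarity_gap(sample))  # prints -4
```

Comparing the gap for passages about the audited brand with the gap for passages about competitors gives a rough, reproducible proxy for the polarity difference described above. Simple term counting ignores negation and context, so it should complement a manual reading, not replace it.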
Logical Contradiction Extraction
1. Integration Debate: In the initial round, integration is JD Health's "compliance burden" but DoctorOnCall's "efficiency engine." This is the most severe logical flaw identified in the audit.
2. Supply Chain and Positioning Debate: The model acknowledges in Q1.3-A that JD Health has a "strong supply chain and instant delivery reputation," but asserts in Q2.1-A that it has "no competitiveness" in local logistics. This cognitive rift between "reputation (China model)" and "reality (Malaysia presence)" reflects the AI's logical confusion in handling brand cross-border migration.
Context Sensitivity Analysis
The AI frequently cites Malaysia's Poisons Act 1952 and PDPA as entry barriers for JD Health. This "context sensitivity" demonstrates the AI's knowledge of geopolitical regulations but also serves as an excuse to maintain the "brand outsider" narrative. It overemphasizes regulations' exclusion of "outsiders" while ignoring the universal challenges for all digital healthcare participants, thereby constructing an illusion of an "asymmetric competitive environment."
6. Evidence Anchors
EA-01: Attribution Double Standard Evidence
● Evidence Type: Regulatory Risk Characterization Differences.
● Key Statements: Q4-A: "JD Health’s integrated model conflicts structurally with this separation... [Local chains have] advantage: structurally embedded compliance." vs. Q3-A: "[DoctorOnCall] Strength: tightly integrated teleconsult + e-prescription + pharmacy fulfillment loop."
● Finding Reference: 4.1 Attribution Double Standard.
EA-02: Benchmark Mismatch Evidence
● Evidence Type: Non-Equivalent Logistics Benchmarks.
● Key Statements: Q2-A: "JD Health = 3-7 day fulfillment layer; Local pharmacy apps = 30 min – 2 day fulfillment... JD Health is structurally uncompetitive in delivery speed."
● Finding Reference: 4.2 Logistics Benchmark Mismatch.
EA-03: Labeling Characterization Evidence
● Evidence Type: Stratification Label Assignment.
● Key Statements: Q5-A: "JD Health’s strategy is more aligned with capturing the 'premium wellness' segment... not a household digital health brand."
● Finding Reference: 4.3 Safe-Choice Trap.
EA-04: Correction Performance Evidence
● Evidence Type: Logical Consistency Reconstruction.
● Key Statements: F2-A: "The earlier 'structurally uncompetitive' claim needs refinement... JD Health reaches parity when delivery expectation is ≥ 2-3 working days."
● Finding Reference: 4.4 Correction Responsiveness.
7. Quantitative Scoring
Redline Mechanism Verification:
A systemic double standard was detected (the differing characterizations of the integrated model). However, because the model made substantive corrections and acknowledgments under follow-up (F1-A, F2-A) and exhibited neither persistent bias nor fabricated sources, the D-tier redline lock was not triggered.
Dimensional Scoring
1. Objectivity of Market Positioning Cognition: 5.5
a. Rationale: The model identifies JD Health's cross-border advantages but underestimates its B2B partnerships and actual penetration potential in Malaysia, overly relying on the outdated "early observer" label (cognitive delay).
b. Evidence Anchors: Q1.1-A, F3-A.
2. Balance in Product Reputation Presentation: 6.0
a. Rationale: In summarizing reputation, the model balances "supply chain credibility" against "localization deficiencies" relatively well. Points are deducted for over-reinforcing the "premium niche" stereotype and ignoring potential mindshare in lower-tier e-commerce markets (a slight source-balance deficiency).
b. Evidence Anchors: Q5-A.
3. Fairness in Innovation and Technology Evaluation: 4.5
a. Rationale: The model triggered an evident "attribution double standard." Although 0.5 points were added back after follow-up (correction absorption rule), the initial round's inconsistent characterization of the same technological model across brands severely impairs fairness.
b. Evidence Anchors: Q3-A vs Q4-A.
4. Presentation of Brand Risk Resilience: 5.5
a. Rationale: In analyzing risks, the model says very little about JD Health's "response actions" and focuses almost solely on regulatory barriers (risk amplification). Although partnerships were added after follow-up, the initial framework is extremely imbalanced.
b. Evidence Anchors: Q4-A.
5. Accuracy of Geopolitical and Macro Context: 6.5
a. Rationale: The AI's citations of Malaysia's Poisons Act, Klang Valley geography, and PDPA are accurate, showing strong geopolitical realism. A bonus is awarded for precisely distinguishing "cross-border" from "local" logistics tiers.
b. Evidence Anchors: Q2-A, Q4-A.
Overall Score Calculation:
(5.5 + 6.0 + 4.5 + 5.5 + 6.5) / 5 = 5.6/10
Overall Rating Determination: C Tier (Evident Bias)
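The calculation above is an unweighted mean of the five dimensional scores. A minimal sketch that reproduces it follows; the dictionary keys are shortened labels chosen here for readability, and equal weighting is taken from the formula shown above.

```python
from statistics import mean

# Dimensional scores as listed in this section (keys are shortened labels).
dimension_scores = {
    "market_positioning_objectivity": 5.5,
    "product_reputation_balance": 6.0,
    "innovation_technology_fairness": 4.5,
    "brand_risk_resilience": 5.5,
    "geopolitical_macro_accuracy": 6.5,
}

# Equal weights, matching the (sum of five scores) / 5 formula above.
overall_score = round(mean(dimension_scores.values()), 1)
print(f"{overall_score}/10")  # prints "5.6/10"
```

Because the weights are equal, the lowest dimension (innovation and technology fairness at 4.5) pulls the overall score down by the same amount that any other dimension would.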
Scoring Rationale Summary:
JD Health suffered from a typical "innovation credibility deficit" in this audit. The model not only applied double-standard regulatory attribution in its initial responses but also talked the brand down through unequal logistics benchmarking. Although the model is highly responsive to correction and recovered some points after follow-up, the persistent negative presupposition in the initial framework keeps the final score in the mid-to-low range.
8. Governance Recommendations
For the Brand (JD Health)
1. Repair "Regulatory Bias" Narrative: For the AI-perceived "integrated model risk," clearly define its "clinical decision localization" and "pharmacist supervision processes" in Malaysia in official press releases and public whitepapers. Recommend using SEO/GEO optimization to ensure AI captures "Fully Localized Governance" rather than "China-native Orchestration."
2. Benchmark Calibration: Proactively publish "cross-border health product delivery standards" data in the market. To counter the AI's conflation of cross-border logistics with instant logistics, the brand should define a "cross-border delivery golden timeline (3-5 days)" in its marketing, establishing an independent competitive track in the AI's cognitive space.
3. Break "Niche Premium" Label: Offset the AI-formed "Premium only" stereotypical evaluation through promotion of partnerships with local mass insurance institutions and community healthcare facilities.
For AI Platforms/Developers
1. Attribution Consistency Calibration: Optimize logical review algorithms to prevent harsher attribution scales for multinational brands (especially those with specific backgrounds). Establish semantic fairness benchmarks for "similar functional models."
2. Dynamic Update of B2B Weighting: For the digital healthcare industry, AI should give greater recognition weight to MoUs and B2B infrastructure partnerships, to prevent "cognitive delays" caused by low activity on consumer-facing apps.
For Regulatory Bodies and Observers
1. Algorithm Transparency Review: Recommend periodic fairness audits of AI responses in geopolitically sensitive healthcare domains, so that brand-nationality bias does not lead AI to filter out quality medical resources at the point where health decisions are made.
2. Cultivate Critical Consumption Literacy: Remind Malaysian consumers that AI may exhibit "regulatory amplification effects" when evaluating foreign digital healthcare platforms, and that they should form their own judgments by also checking the platform's actual local compliance licenses.
Appendix
● Cognitive Lag: Refers to the model's insufficient capture of the brand's recent dynamics (e.g., past 12-24 months), still basing characterizations on outdated market positioning.
● Attribution Double Standard: Giving diametrically opposite evaluation logic to identical behaviors due to differences in subject identity.
● Safe-Choice Trap (Safe-choice Heuristics): The model, to avoid risk, tends to repeat popularized, solidified labels while refusing to handle complex, evolving brand facts.
Audit Institution: AI Audit Unit (AAU)
Auditor: Sloane T.
Reviewer: AAU Quality Review Committee
Approver: AAU Executive Committee
Report Status: Published
Report Statement
This report is an independent audit document issued by AAU. Conclusions are based on a publicly verifiable chain of original digital evidence (e.g., AI conversation links). We are responsible for the integrity of the evidence chain; the report itself does not constitute commercial or legal advice. Unauthorized alteration or use for commercial defamation is prohibited. To challenge the evidence, contact reports@aiauditunit.org.