There exists a peculiar amnesia in software engineering regarding XML. Mention it in most circles and you will receive knowing smiles, dismissive waves, the sort of patronizing acknowledgment reserved for technologies deemed passé. “Oh, XML,” they say, as if the very syllables carry the weight of obsolescence. “We use JSON now. Much cleaner.”

  • Ephera@lemmy.ml
    link
    fedilink
    English
    arrow-up
    13
    ·
    5 hours ago

    IMHO one of the fundamental problems with XML for data serialization is illustrated in the article:

    (person (name "Alice") (age 30))
    [is serialized as]

    <person>
      <name>Alice</name>
      <age>30</age>
    </person>
    

    Or with attributes:
    <person name="Alice" age="30" />

    The same data can be portrayed in two different ways. Whenever you serialize or deserialize data, you need to decide whether to read/write values from/to child nodes or attributes.

    That’s because XML is a markup language. It’s great for typing up documents, e.g. to describe a user interface. It was not designed for taking programmatic data and serializing that out.

    • atzanteol@sh.itjust.works
      link
      fedilink
      English
      arrow-up
      3
      ·
      4 hours ago

      This is your confusion, not an issue with XML.

      Attributes tend to be “metadata”. You ever write HTML? It’s not confusing.

      • Feyd@programming.dev
        link
        fedilink
        arrow-up
        6
        ·
        edit-2
        3 hours ago

        In HTML, which things are attributes and which things are tags are part of the spec. With XML that is being used for something arbitrary, someone is making the choice every time. They might have a different opinion than you do, or even the same opinion, but make different judgments on occasion. In JSON, there are fewer choices, so fewer chances for people to be surprised by other people’s choices.

        • atzanteol@sh.itjust.works
          link
          fedilink
          English
          arrow-up
          1
          ·
          edit-2
          53 minutes ago

          I mean, yeah. But people don’t just do things randomly. Most people put data in the body and metadata in attributes just like html.

    • Kissaki@programming.devOP
      link
      fedilink
      English
      arrow-up
      2
      ·
      4 hours ago

      It can be used as alternatives. In MSBuild you can use attributes and sub elements interchangeably. Which, if you’re writing it, gives you a choice of preference. I typically prefer attributes for conciseness (vertical density), but switch to subelements once the length/number becomes a (significant) downside.

      Of course that’s more of a human writing view. Your point about ambiguity in de-/serialization still stands at least until the interface defines expectation or behavior as a general mechanism one way or the other, or with specific schema.

    • aivoton@sopuli.xyz
      link
      fedilink
      arrow-up
      4
      ·
      edit-2
      5 hours ago

      The same data can be portrayed in two different ways.

      And that is issue why? The specification decided which one you use and what do you need. For some things you consider things as attributes and for some things they are child elements.

      JSON doesn’t even have attributes.

      • Ephera@lemmy.ml
        link
        fedilink
        English
        arrow-up
        7
        ·
        4 hours ago

        Alright, I haven’t really looked into XML specifications so far. But I also have to say that needing a specification to consistently serialize and deserialize data isn’t great either.

        And yes, JSON not having attributes is what I’m saying is a good thing, at least for most data serialization use-cases, since programming languages do not typically have such attributes on their data type fields either.

        • aivoton@sopuli.xyz
          link
          fedilink
          arrow-up
          1
          ·
          2 hours ago

          I worded my answer a bit wrongly.

          In XML <person><name>Alice</name><age>30</age></person> is different from <person name="Alice" age="30" /> and they will never (de)serialize to each other. The original example by the articles author with the person is somewhat misguided.

          They do contain the same bits of data, but represent different things and when designing your dtd / xsd you have to decide when to use attributes and when to use child elements.

    • Feyd@programming.dev
      link
      fedilink
      arrow-up
      3
      ·
      5 hours ago

      JSON also has arrays. In XML the practice to approximate arrays is to put the index as an attribute. It’s incredibly gross.

      • Kissaki@programming.devOP
        link
        fedilink
        English
        arrow-up
        2
        ·
        4 hours ago

        In XML the practice to approximate arrays is to put the index as an attribute. It’s incredibly gross.

        I don’t think I’ve seen that much if ever.

        Typically, XML repeats tag names. Repeating keys are not possible in JSON, but are possible in XML.

        <items>
          <item></item>
          <item></item>
          <item></item>
        </items>
        
        • Feyd@programming.dev
          link
          fedilink
          arrow-up
          4
          ·
          edit-2
          3 hours ago

          That’s correct, but the order of tags in XML is not meaningful, and if you parse then write that, it can change order according to the spec. Hence, what you put would be something like the following if it was intended to represent an array.

          <items>
            <item index="1"></item>
            <item index="2"></item>
            <item index="3"></item>
          </items>