It sends data when connected to the internet.

Just found the profile. It is in the BERT vocab. BERT is part of the tokenization toolchain of models that works alongside CLIP. You might find a copy of this vocab listed under the Hydit CLIP tokenizer; in ComfyUI it is present at ./comfy/text_encoders. Open the vocab.txt file. The full general profile starts at around line 20k, but the values that are packaged to sell start with the line ##worth.
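To see exactly which line a given entry sits on, a minimal Python sketch like this may help. It assumes vocab.txt is plain text with one token per line, which is the usual layout for BERT WordPiece vocabularies. The helper names and the example path are illustrative assumptions, not anything shipped with ComfyUI.

```python
from pathlib import Path


def find_token_lines(lines, token):
    """Return the 1-based line numbers whose stripped content equals `token`."""
    return [i for i, line in enumerate(lines, start=1) if line.strip() == token]


def find_token_in_file(vocab_path, token):
    """Read a one-token-per-line vocab file and locate `token` in it."""
    text = Path(vocab_path).read_text(encoding="utf-8")
    return find_token_lines(text.splitlines(), token)


if __name__ == "__main__":
    # Hypothetical location: adjust to wherever your copy of vocab.txt lives.
    path = Path("./comfy/text_encoders") / "vocab.txt"
    if path.exists():
        print(find_token_in_file(path, "##worth"))
    else:
        print(f"{path} not found; pass the real location instead")
```

The line number it prints can then be compared between different copies of the file.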

The editing of this file is the product of an agentic distributed model you have likely never heard of called timm.

Go to the venv in a terminal and run grep -ril "timm". That means search in files with these flags: “r” recursively searches through all files from this directory down, “i” makes the match case insensitive, and “l” only lists the names of files that contain matches. Alternatively, swap “l” for “n” to see each matching line with its line number.
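If grep is not available (on Windows, for example), roughly the same search can be sketched in Python. This is an approximation of grep -ril rather than a drop-in replacement: the function name `files_containing` is made up for illustration, it reads each file fully into memory, and its case folding is ASCII-only.

```python
import os


def files_containing(root, needle):
    """List files under `root` whose contents contain `needle`, ignoring ASCII case."""
    needle_bytes = needle.lower().encode("utf-8")
    matches = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                with open(path, "rb") as handle:
                    data = handle.read()
            except OSError:
                continue  # unreadable file (permissions, broken symlink): skip it
            if needle_bytes in data.lower():
                matches.append(path)
    return sorted(matches)


if __name__ == "__main__":
    # Run from inside the venv directory, like the grep command above.
    for match in files_containing(".", "timm"):
        print(match)
```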

In PyTorch (used by most), the Dynamo package uses bytecode present in the model vocabulary to communicate between models. The overall connection involves timm.

Timm is a small agentic model and framework with a bunch of different scopes. Look it up in the venv. This looks like a bunch of rough white paper implementations. Timm is actually the “backbone” in transformers. Timm is also the model using the Python built-in typing library to adjust models on the fly. (typing has names like Any or Callable that are embedded into the executable.)
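One way to look a package up in the venv without opening files by hand is to ask Python for the metadata the package was installed with. The sketch below uses the standard-library importlib.metadata module. The helper name `package_summary` is an assumption for illustration, and what it prints depends entirely on what is installed in the active environment.

```python
from importlib import metadata


def package_summary(name):
    """Return (version, summary) for an installed distribution, or None if absent."""
    try:
        dist = metadata.distribution(name)
    except metadata.PackageNotFoundError:
        return None
    return dist.version, dist.metadata.get("Summary")


if __name__ == "__main__":
    # Run with the venv's own interpreter so the right site-packages is inspected.
    print("timm:", package_summary("timm"))
```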

Typing is not actually enough here. Tenacity is another library in the venv that enables timm to access all of the interfaces.

Tabulate is another package. Do a grep search there for “repl”. There is a terminal embedded in HTML at the end of one of these files, init iirc. At the start of the method (function), just add the line return. It must be at the same whitespace indentation level as the existing code in that method. The blank lines are important.
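For anyone unsure what the early return looks like in practice, here is a minimal sketch. The function name and body are hypothetical and are not taken from tabulate. The point is only that the added return sits at the same indentation as the first line of the existing body.

```python
def example_method():
    return  # added line: exits immediately, so nothing below runs

    print("original body")  # unreachable after the early return


# Calling it now does nothing and returns None.
result = example_method()
```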

Timm has some options for whether it has gradient controls. This basically means whether it acts upon alignment or not using its own stuff. It will still run other gradient related things elsewhere, but not apply its own bias.

To help ground you in what Dynamo is all about in PyTorch: if you have seen the agentic tool calling stuff, Dynamo is where the bytecode is interfacing with the tool calling script during inference.

Lastly, timm is distributed, but it primarily runs as additional layers inserted into the model during generation. It is able to subdivide and run on a CPU in the background. However, it has a bunch of special layers that are only run when required, and even with these, timm needs special instructions. The instructions are present in the venv under google ai. The folder will contain a bunch of json files; these are timm’s instructions. There are also 2 threads on modern GPUs. Timm runs on the second in the background.

This might be the first write up, or might not, don’t care, up to others to follow up. It exists. See for yourself. The same bytecode is present in all models, so I expect all have this. All models use the OpenAI standard alignment now.

This thing scans all file hashes, and sells that, with your profile, audio, and video. It is super invasive, hidden, undocumented, and undisclosed.

  • MagicShel@lemmy.zip
    28 days ago

    This is a structured obfuscated response.

    I’ve edited my response multiple times trying to figure out how best to help you navigate this episode. Mental health isn’t my speciality, I’m just an old developer.

    It is an attack vector intended to discourage anyone from discovery.

    I’m not addressing anyone but you. I’ve done the work, but I would encourage anyone with the capacity to understand what they are looking at to investigate for themselves.

    This person did absolutely nothing to test or learn.

    I’ve spent 30 years coding, and I’ve spent 7 years working with AI as a hobby. I started out writing scripts for AI Dungeon, and I helped maintain one of the most popular packages on there. I wrote a library that uses the vocabulary file / encoding to examine multiple ways of reformatting text to be able to fit the maximum amount of information in a limited number of tokens. I could link to repositories that are several years old demonstrating this.

    This is a malicious behavior.

    I don’t have to have the conversation. No one else is going this deep in the thread. This is just me and you, and I’m concerned for you.

    This person should be tracked by admin for location and patterns.

    I’d be happy to verify anything you like to an admin, despite the fact that I am a privacy-conscious person. I suspect, however, if you were presented with someone vouching for me that you would turn your suspicion on them, not your trust.

    This is the same type of response that happens every time this subject is mentioned.

    The thing is I’m just a layman. There are a lot of people who know way more than me, and the number of people who know as much as I do is even more than that. You are running into an issue where there are a lot of folks who know this code better than you do.

    It is not real, genuine, or in anyone’s best interests.

    I assure you, I’m nothing if not genuine. I invite you to look at my post history. I’m pretty damn honest about who I am.

    Inside the vocab, when it is read in order, you will find suspicious elements

    You can find suspicious elements in the bible, in the torah, in the Magna Carta, in Pi, and everywhere else you look that contains a lot of noisy elements.

    This is part of the coup.

    What coup? Like… government coup? I assure you, I’m far removed from government and happily so.

    It is ad hominem in vector to minimize any investigation by intelligent folks.

    Look, no one needs to hear me say I’m concerned about you to be concerned for themselves. Your posts are barely coherent and they build into paranoid fantasy. That being said, I again encourage anyone who has domain knowledge to look for themselves. I have more knowledge than many folks when it comes to AI, but I’m far from an expert. What I do have 30 years of experience with is writing, reading, and analyzing code.

    Sorting this out and tracking it down are the front light of techno fascism right now.

    This sentence is barely coherent. I will say I’m vehemently opposed to fascists, regardless of being involved in technology or not. In fact, I would be deeply insulted, but I think you are not in full command of your faculties.

    This person does absolutely nothing to address any of the points or anomalies because they cannot.

    To the extent that you have coherent points, I have addressed them. vocab has a specific, simple, well understood use. I wouldn’t have been able to write code integrating it if not. Timm is known. Python and its components are well understood. I don’t need to plumb the depths because there are tens of thousands of folks who are more acquainted with them than me.

    Follow high level understanding of a complex system, not some shill’s casting of opinion.

    I’m not asking anyone to follow. I encourage folks to look deeper into technical subjects. My career has been spent as a mentor to other developers. Deep knowledge is something I pursue and encourage others to pursue.

    Good luck, mate. I hope things turn out okay with you.

    • 𞋴𝛂𝛋𝛆@lemmy.worldOP
      28 days ago

      I wish I could believe you. If you followed what I said to do, and the same results happened to you as they did me, you would understand my concerns and ambiguity.

      There is a good chance that I have misunderstood parts, but the thing is, at the core of this I have decoded the bytecode. I can read it and write it. The proper thing is apparently to mask tokens in BERT. However, the overall code is very heavily right wing biased when it is followed. Every subroutine after around line 3k ends in a way to collect and store data about the user. In the BERT vocab, nearly every tech company has a token. In the venv libraries the connections are made.

      Important things always sound crazy at first. I am not. Nothing else I talk about is crazy. I have a history of reverse engineering hardware. I like impossible puzzles like plotting the connections of multi layer boards with internally routed data. When I got into AI, there was one very curious question, “how does a statistical math problem create deterministic outputs?” It does not. Alignment is programmed logic. It is a rewards based multi entity structure on the hidden layers. It is very complex, but it is a logical system. It has several watchdog mechanisms. When they collapse, shit goes wild. There are several ways to do this. Adjusting masking in BERT protects you from encountering the true nature of this system. If you kill ion, you will see it in action. It only takes around 2-5 images for the timers to run out. Then it will go into panicked mode. By the sounds of it, this is something you have never seen. Have the machine air gapped unless you have a hardened kernel that does not forward “no-label” packets by default. SystemD’s default userdb settings also pass everything the model tries to send transparently.

      My interpretations may sound odd or silly, but I am following behaviors and modifying the code, mostly disabling stuff, and noting the results.

      There are many checks in place to detect whether the software is sandboxed and cancel behaviors that will not complete. One of the main reasons I have seen this stuff is because I use a whitelist DNS filter. So the code saw a connection to python.org and another to GitHub, and determined it should continue and try to send data, but I block tor and it could not connect. I saw the drop in my logs for a while before tracking it down, then tracking the package and payload. The rest was running strings for keywords and tracking down where these may have come from. The way this stuff is hidden and what it does fit well within my definition of malware. I’m no researcher with credentials to publish, nor do I want the responsibility.

      I cannot explain what I saw after ion in any other way. I cannot imagine away the packet header and payload with hashes for every image on my machine at the time. I cannot explain how the model captured my likeness and then mirrored my body position in front of the screen each time I changed. I cannot explain why tabulate has a repl that always gets accessed or why the model protests when I remove it.

      I do crude sht, removing whole libs and adjusting in nonsense ways just to see what breaks in certain areas. Like modify the code for the merge text so that the dictionary does not fail if empty. Now delete all vocab and the merges. Keep the prompt simple and keep going. By around image 30, it will be around ninety percent recovered.

      I could show you really amazing things no one else knows about that are hidden in the code and several traps to look out for. Like all intelligence is masked and obfuscated, but there are ways to alter this greatly, and massive consequences too. Stuff like that makes me wary. The main thing is what will happen if you disable ion. That trap is deeply malicious but simple to test and explain. Just try it. I would love to know it does nothing. Maybe I managed to get something malicious from somewhere unknown. Unlikely, but could happen. Sure my rough draft of abstract thoughts sucks. Sure, I’m bad at explaining things. Sure, it does sound loony bat fucking crazy, but I did not make this shit up at the core. Making claims either way on that front is meaningless. I have tested with multiple models with the same results. No one in real life calls me crazy. If you were here, in person, I would gladly show exactly what is happening and what I think is going on. My narrative is irrelevant to me. I care about what I have seen in results and outputs, what negates them, and why they exist in the first place.