Should we allow AI generated text on this forum?

oscar · April 22, 2024, 7:13pm

It’s a little bit more elaborate and complex, but this is the gist of it:

The base models underlying ChatGPT and similar systems work in much the same way as a Markov model. But one big difference is that ChatGPT is far larger and more complex, with billions of parameters. And it has been trained on an enormous amount of data — in this case, much of the publicly available text on the internet.

In this huge corpus of text, words and sentences appear in sequences with certain dependencies. This recurrence helps the model understand how to cut text into statistical chunks that have some predictability. It learns the patterns of these blocks of text and uses this knowledge to propose what might come next.

An early example of generative AI is a much simpler model known as a Markov chain. The technique is named for Andrey Markov, a Russian mathematician who in 1906 introduced this statistical method to model the behavior of random processes. In machine learning, Markov models have long been used for next-word prediction tasks, like the autocomplete function in an email program.

In text prediction, a Markov model generates the next word in a sentence by looking at the previous word or a few previous words. But because these simple models can only look back that far, they aren’t good at generating plausible text.

The newer generative text AIs basically scan multiple pieces of text and start building word association maps. Based on those maps, they guess which word comes next in the sentence they are writing.

Think of it like this.
You write a sentence. You begin with a single word given to you (the “prompt word” or command). You Google that word, and from the first 100 hits on Google you look at which word follows the “prompt word” and pick the one that occurs most often. Then you Google the 2 words you have now, and see which word most often follows those 2 words combined. Rinse and repeat.
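
For anyone who wants to see that in code, here is a minimal Python sketch of the "look up the most common next word, then repeat" idea. It is only an illustration under simple assumptions: the tiny `corpus` string stands in for the Google hits, the function names `build_table` and `generate` are made up for this example, and the lookup uses at most the last 2 words. Real systems like ChatGPT are neural networks and do not work from a literal lookup table like this.

```python
from collections import Counter, defaultdict

def build_table(text, max_context=2):
    """Count which word follows each run of 1..max_context words."""
    words = text.lower().split()
    table = defaultdict(Counter)
    for i in range(1, len(words)):
        for n in range(1, max_context + 1):
            if i - n >= 0:
                context = tuple(words[i - n:i])
                table[context][words[i]] += 1
    return table

def generate(table, prompt_word, length=3, max_context=2):
    """Greedily append the most frequent follower of the last few words."""
    out = [prompt_word.lower()]
    for _ in range(length):
        # Try the longest context first, then fall back to a shorter one.
        for n in range(min(max_context, len(out)), 0, -1):
            followers = table.get(tuple(out[-n:]))
            if followers:
                out.append(followers.most_common(1)[0][0])
                break
        else:
            break  # no known follower for any context, stop early
    return " ".join(out)

# Toy stand-in for "the first 100 hits" (an assumption for this example).
corpus = (
    "the fig tree grows fast . "
    "the fig tree needs sun . "
    "the fig tree grows tall . "
    "the apple tree needs water ."
)
table = build_table(corpus)
print(generate(table, "the", length=3))  # the fig tree grows
```

Notice that the sketch never checks whether the sentence is true. It only checks which word has followed the previous ones most often.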

There are a few other things going on, but the above example is the main principle.

This reliably “mirrors” human writing, since it is parroting and combining from human-written texts.

But it fundamentally has little understanding of the real-life meaning of the words it’s using in the generated text.

The generative AI will only suggest you can “stand on a stool to reach a high shelf” if that example is given somewhere in the text it scanned.

A human, on the other hand, could logically think “If I can sit on that chair, it is strong enough to handle my weight, so I could probably also stand on it,” and come up with standing on a chair to reach a high shelf on their own.

An AI could not, since it does not “understand” what a chair is, only how the word is often used in the scanned text (for example, which words appear in the same sentence as “chair”).

source: Explained: Generative AI | MIT News | Massachusetts Institute of Technology

Tana · April 22, 2024, 7:15pm

We’ve been discussing the possibilities of using AI in the context of screenplay at an art conference, in quite a negative light (surprise), but someone mentioned using perplexity.ai to research questions like your sample. Feeding it to Perplexity, you get more reasonable results, and a list of sources.

oscar · April 22, 2024, 7:15pm

I absolutely love this post you made.

It’s an excellent example of multiple problems with generative text AI on this forum.

The “4) transgenic plant” part made me laugh the hardest.

It also highlights the importance of mentioning not just the reply of the AI, but also the command or prompt you gave the AI.

Oh, and the peach emoji at the end XD

a_Vivaldi · April 22, 2024, 7:16pm

Good summary. While I haven’t studied LLMs in any detail, my understanding and the only thing I’d add is that they are a bit more scale-invariant than that. So not only do they build mappings for words, but also whole sentences or even paragraphs, or at least abstractions of those sentences and paragraphs.

oscar · April 22, 2024, 7:20pm

You’re right.

And I am by no means an expert (although I have some basic understanding, like knowing they are based on neural networks, and some of the basic math/coding behind that, I would have a hard time building a decent AI myself).

It’s hard to explain some of the principles behind AI without it getting really elaborate and complex. There are many more things omitted from the above post/explanation, but it gives you a basic idea of the way text-generating AI works, without all the details and complexity.

belowtheterrace · April 22, 2024, 7:24pm

But don’t you potentially get equally useless results by Googling the same question? I think AI is just a tool and not the core problem. Don’t post words that aren’t your own unless they are from a verified source that you can cite. Also, don’t post words claiming to have experience with a subject that you don’t actually have experience with. Do you actually need a specific AI rule? Aren’t those things covered already? Like I said earlier, you don’t need AI to post bad or false content. I think the term AI invokes some emotion, but I’m not sure it’s warranted in most cases.

scottfsmith · April 22, 2024, 7:28pm

FYI I have already deleted some AI posts here which were not marked as such. It’s basically plagiarism to copy from a source and not mark it. A similar thing holds for pasting content found via a Google search … it needs to be so marked.

On the other hand I think it’s perfectly OK to post AI responses as long as it is very clearly marked as from AI. Similar for Google search results copied and pasted in. Those who don’t want to read the AI can just skip over them. I for one don’t really like reading AI, it gets under my skin the way it sounds. It can also sound perfectly reasonable yet be 100% BS. There is already enough of that in circulation today.

swincher · April 22, 2024, 7:30pm

I just tried googling it and could not find any results that implied you could successfully graft fig on dogwood.

I think there’s a much greater danger of someone posting output from a generative AI that sounds very plausible but is completely false, compared to someone just lying or getting a bad answer from a search engine (though as generative AI gets built into search engines more, that risk will go up).

oscar · April 22, 2024, 7:32pm

Thanks for that response.

https://www.perplexity.ai/ is quite nice, since it gives some of the source material it used to answer your question. I wasn’t aware of this particular AI. It’s now my favorite one <3

swincher · April 22, 2024, 7:35pm

I decided to give it a somewhat niche question, and it gave a decent answer with good caveats:

Not sure how graft compatibility could “vary regionally,” though.

belowtheterrace · April 22, 2024, 7:41pm

Eventually someone will make an AI system geared specifically towards gardening and start feeding it all of that data. Business opportunity anyone?

The future is a scary place.

TNHunter · April 22, 2024, 7:46pm

I don’t know how to use AI and don’t want to learn. Would prefer not to be reading artificial replies here.

I worked in computer technology 41 years… retired now and perfectly happy not learning anything else new.

I know how to fish, hunt, grow stuff… good enough for me.

I voted no AI.

Drew51 · April 22, 2024, 8:20pm

Was that question asked? I didn’t see the question. So was it asked how to graft it? Was it asked if it would work, or was it just asked how to graft it? Big difference. If you did want to try, was the info wrong?
A more interesting one would be fig on hardy mulberry, since both are in the same family (Moraceae).

FarmGirl-Z6A · April 22, 2024, 8:22pm

I voted other because, while it’s a little odd to me that people are using it to post on a forum, at the end of the day I really just don’t care.

a_Vivaldi · April 22, 2024, 8:30pm

I’m in a pretty similar boat then. I’ve worked with the stuff a bit, but not with LLMs specifically.

The best way I’ve got to describe it is to start with how neural nets treat images (and this is not really directed at you since you already know this, but I figured it’s as good a point as any to bring this up): The network will start by going pixel by pixel and looking for patterns and relationships between each pixel and its neighbors. Are they the same, are they different and how so? Then it dials back the resolution and does the same thing, but comparing groups of pixels with other groups of pixels. Say a 2x2 square of pixels instead of a single pixel this time. And since it now knows which pixels are similar to their neighbors or different in such and such a way from their neighbors, it can look for patterns that occur in groups of pixels where the pixels are similar, for patterns where the groups are made up of pixels with such and such a pattern, etc. It keeps on doing these searches and then lowering the resolution until it finally just looks at the entire image in one go. That’s when it gives you some output about what those patterns matched.

Training is where you go in and discard the attempts it made when the patterns it found were not useful in giving the right output, and have it try again using the patterns that gave better outputs and just varying them a bit. Do that hundreds of thousands or millions of times, and with many, many different images of the same subject matter (say, dog breeds), and eventually it’ll get pretty good at finding pixels of important fine details (eyes, paws, teeth) and finding patterns in larger groups of pixels (space between eyes, the shape of the furry bits) while ignoring stuff that isn’t relevant (pixels that look like grass or snow or whatever else is in the background), and combining those patterns into an overall statistical probability of some output (90% this is a German Shepherd, 8% it is a husky, etc.).
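
To make the "dial back the resolution" step concrete, here is a minimal Python sketch of 2x2 average pooling applied repeatedly until only one value is left. It is a simplification under stated assumptions: the function names `pool_2x2` and `pyramid` and the 4x4 example image are made up for illustration, the image is assumed to be square with a side length that is a power of two, and plain averaging stands in for the learned pattern detectors a real network would use at each level.

```python
def pool_2x2(image):
    """Halve the resolution by averaging each 2x2 block of pixels."""
    rows, cols = len(image), len(image[0])
    return [
        [
            (image[r][c] + image[r][c + 1]
             + image[r + 1][c] + image[r + 1][c + 1]) / 4
            for c in range(0, cols, 2)
        ]
        for r in range(0, rows, 2)
    ]

def pyramid(image):
    """Return the image at every resolution, down to a single value."""
    levels = [image]
    while len(levels[-1]) > 1:
        levels.append(pool_2x2(levels[-1]))
    return levels

# Hypothetical 4x4 grayscale image (assumption for this example).
image = [
    [0, 0, 8, 8],
    [0, 0, 8, 8],
    [4, 4, 0, 0],
    [4, 4, 0, 0],
]
for level in pyramid(image):
    print(level)
```

Each printed level is the same picture at half the resolution of the previous one, ending with a single number for the whole image.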

The process with text is similar. It looks at the words first, then at small groups of words, and at small groups of words that tend to occur together and form certain patterns. Then it looks at bigger groups, sentence clauses, whole sentences. The model will keep lowering the resolution up to multiple sentences, paragraphs, or maybe even more. Then you get your output. And since this whole thing is basically reversible, you can, once the model is trained, do it backwards so that it spits out a bunch of words given a short label (the output from before, like having the image model run backwards and spit out an image given the label “pug”).
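
As a rough picture of "words first, then small groups of words," here is a short Python sketch that lists the single words, pairs, and triples in a sentence. The helper name `ngrams` and the example sentence are assumptions for illustration. Actual language models learn their own groupings inside a neural network rather than listing fixed-size word groups like this.

```python
def ngrams(words, n):
    """Return every run of n consecutive words as a tuple."""
    return [tuple(words[i:i + n]) for i in range(len(words) - n + 1)]

# Example sentence (an assumption for this sketch).
sentence = "how do i graft a fig onto a dogwood".split()
for n in (1, 2, 3):
    print(n, ngrams(sentence, n))
```

With n=2 the pair ("a", "fig") shows up right next to ("graft", "a"), which is the kind of word pairing that makes the whole sentence look like an ordinary grafting question.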

There are other things the models do, and not all models use this exact process (most don’t, tbh), but that’s sort of the general idea. And the models definitely get a lot of hard-coded guardrails too, where the developers reach in and force the model to give certain outputs in certain conditions, because there are always going to be edge cases where the model really struggles to give the desired behavior but just adding a few hard-coded rules will make it look a lot better. Constantly bullet-pointing everything is one of those rules, as is not using nasty language or saying stuff that is obviously political.

And so it is not all that hard to understand why a small change in a group of words that otherwise really seem to fit a certain pattern (how do I graft this plant onto this plant and treat this plant disease) can so thoroughly mess up the model. The word fig and the word pairing of fig and graft fit in so well with sentences talking about spraying fungicides to control plant diseases and paragraphs of instructions on how to whip graft that the model has never had training to indicate that the output it is giving is complete rubbish. This is especially true because there are so few sentences in the training dataset that explicitly say things like “no, figs cannot be grafted to dogwood” and “no, figs don’t get HLB and cannot be treated for HLB.”

swincher · April 22, 2024, 8:32pm

The question was “how to,” but I think that’s splitting hairs to say it makes a difference. The answer pretty clearly said it’s something you “can do.” The whole point was to show that generative AI will answer the question without regard to whether the answer is actually possible in the real world, and that’s why it’s not good to just post its answers on the forum as if they are accurate.

And just to be clear, I also googled the exact same question (how to do it, not whether it was possible), and none of the results gave a how-to for this impossible graft. There were separate results about grafting figs and grafting dogwood, but nothing like the AI answer that said this is how to graft them together.

Drew51 · April 22, 2024, 8:37pm

True, not everybody can graft.
We can agree to disagree. It did exactly what it was asked. I’m impressed, since Google itself didn’t work either. Some people suck at searching, and just because you didn’t find it doesn’t mean it’s not there. Maybe the question would get an answer on Google Scholar. AI is not that intelligent yet. It’s an infant. Ask again in fifty years. Wow, I just got a cold chill!

oscar · April 22, 2024, 8:43pm

I think this is an excellent simplified explanation of how these text-generating AIs work (and also image AIs).

I’m not against AI, btw. I have some AI image apps on my phone. I love how well they perform when trying to ID a plant or insect from a photo I took. Although they are not 100% accurate, they give me an estimated accuracy and a good starting point for where to search to narrow it down.

This discussion, and especially https://www.perplexity.ai/, might change my mind from a “no” vote to a “yes” when properly referenced.

Although I would still prefer to keep AI text off the forum. If I want an AI reply, I’d rather ask the AI myself than have some forum member do it for me.

I think either discouraging AI-written text on the forum, or, if AI-generated text is posted, requiring the poster to
-name the AI engine
-give the command/prompt used
-clearly mark which part of the post was AI-written and which part they wrote themselves
-describe how their own knowledge/experience matches or mismatches the AI text

would be a good thing for the forum.

a_Vivaldi · April 22, 2024, 8:58pm

Yes, again, the same. These things are quite good at clustering and recognizing stuff, so the AI ID apps are really pretty darn solid. I have them on my phone too. ; )

But I’m overall not a big fan of generative-AI, particularly outside of images (image generation does not have as many pitfalls as text imho).

Another example of the sort of stuff generative-AI really cannot do: inference.

It’s pretty clear that the answer is toddler (then adult) then house.

Once again, an example of missing a small change, and also, I think, an example of a developer hard-coding a certain response.

There’s almost certainly a check for if something looks like a programming or math question and to use a certain type of answer in response (in this case, something like a definition or encyclopedia entry, as opposed to the bullet-point template hard-coded into the generic answer)

swincher · April 22, 2024, 8:59pm

I’m not disagreeing! But that’s the problem. If someone asks a similar question here on the forum and someone else doesn’t know that the answer should be “that’s impossible,” they might ask their favorite generative AI for an answer, and copy/paste in response as if it’s true. That’s the reason I think the standard should be “don’t post AI answers” or at least “only post them with lots of warnings about how they are probably inaccurate.”