
78. Unleashing AI Wizards in Code Realms with William Zeng

· AI,podcast

 


 


Podcast with William Zeng Part 3

 

Summary:

Unlock the secrets of the AI-driven coding revolution in this captivating podcast featuring William Zeng. Prepare to be astonished by Sweep,
the powerful software that collaborates with developers to streamline their
coding processes. From fast code recommendations to identifying issues, Sweep
is redefining the developer's workflow. Embrace the future of coding as William
reveals how AI can transform mundane tasks and let developers soar to new
heights of innovation and productivity. Tune in now to unleash the potential of
AI wizards in the captivating realms of code.

In this intriguing podcast, Andrew Liew delves into the fascinating world of artificial intelligence with William Zeng, an AI expert. They explore the innovative software called Sweep, developed by William's team, that has the power to assist developers in coding tasks. With Sweep's AI
capabilities, developers can receive code recommendations, identify bugs, and
enhance their coding speed. William demonstrates how Sweep handles various
scenarios, its strengths, and how it deals with mistakes. He also addresses concerns about the security of using AI for code review and explains the importance of human oversight in the development process. Discover how AI is transforming the development landscape, paving the way for developers to focus on complex problem-solving and unleashing their creativity.

[00:00:00] Andrew Liew: So with ChatGPT, sometimes it gives like a hallucination effect, or rather some outcome that is not helpful or not true. How does the end user, who's a dev, work it out with Sweep?

[00:00:11] Andrew Liew: Is there some kind of feature where there's positive reinforcement or a corrective mechanism around there?

[00:00:16] William Zeng: Yeah. So I'll actually show you one where it failed. In this one, it actually didn't work. I think it actually forgot to... so this is the one that we just ran.

[00:00:26] William Zeng: It actually didn't define the get initials one. So I can say, "You didn't define the get initials method." This is how you'd respond to a junior developer as well. If they made a mistake, I would tell them, "Hey, you forgot to write this method." And then what will happen is Sweep will run in the meantime, and then...

[00:00:46] William Zeng: It actually tried to fix everything. Wow. I believe this one actually needs to... actually, we just added a change yesterday.

[00:00:53] Andrew Liew: How long does it take for Sweep, like for this particular case, to acknowledge the response [00:01:00] that you give to it?

[00:01:01] William Zeng: For this one, we have two methods. The first one is, if you see this one, it actually reviews its own PR. So it says the last couple of ones succeeded. I think we're five out of six so far, and we got the one that didn't succeed in this case. So it says, "Hey, you replaced these hardcoded initials, but I can't find where this function is defined."

[00:01:21] William Zeng: So we need to make sure to define this function. And then typically Sweep will address the issue here as well, or take comments. So you can see here this should be fixed. It actually just added the method because I commented on it, and this build just succeeded. So if you see now, it just defined the function that I asked it to define.

[00:01:42] William Zeng: Wow.

[00:01:43] Andrew Liew: Yeah, it's pretty eye-opening, because I think I've seen a similar product before, but it took, I think, a very long gestation period to just work on a very small code base. Have you tested the limit of how far it [00:02:00] can go in terms of browsing or combing through the code base?

[00:02:03] William Zeng: Yeah, so I think our record for the largest code base that we've run is one of our customers', who has a monolith repo. For those who don't know, a monolith repo is where a company puts all of their code in one folder, basically, or one directory. Is that a good explanation of a monolith repo?

[00:02:23] Andrew Liew: Yeah, you put everything into one page or, like, one piece of document, one file. But how many lines of code are we talking about, as an estimate? Like a hundred million, a million? I don't know.

[00:02:34] William Zeng: Yeah. So we can... I actually don't have entire redaxes, but I can give you an estimate.

[00:02:39] William Zeng: Also, as you saw, this one just fixed the issue and it should also be working now. I'll just show you that one. So it's also fun when sometimes the demo doesn't work and then we can see how we would fix it. It doesn't always work. So this demo also shows the fix, even when it doesn't get it right on the first try.

[00:02:56] William Zeng: We definitely think that, with hallucinations, it's definitely not [00:03:00] going to get it right every time. So we use these kinds of code review tools to allow Sweep to recover from mistakes. And I can go more into that after this, but to answer your question, sorry.

[00:03:10] Andrew Liew: It's good that you have a mechanism that you're working on to correct Sweep, basically to enable it to be a smart intelligence. But the question is, what is the percentage of accuracy at the moment, as in giving you a false positive or a false negative, you know, in that sense?

[00:03:25] William Zeng: Yeah, so that one we actually do not track at the moment. But one way that we can rely on this is because it's a code review. It's typically very highly scrutinized, with good testing and good review practices. I think that's how you can make AI-generated code work, because a lot of people are really concerned.

[00:03:45] William Zeng: And to be fair, we're also concerned. We review all PRs that go into our code base as well. So, going back to that, for the model to be better, there are some things we can do, and we do a lot of those things, but the ultimate responsibility is on the reviewer, just like at [00:04:00] any software company where senior developers work with a junior dev.

[00:04:03] William Zeng: It's their job to make sure everything is good. Yes.

[00:04:06] Andrew Liew: Yes. I definitely get where you're coming from. Okay. Now, let's move on to the next big question: what is your view on the adoption of these kinds of large language models for dev workers? They need, you know, great, accurate, fast work done.

[00:04:21] William Zeng: Yeah. So that's something we're really focused on. Focusing on that fast part, I think there are two kinds of fast. The first kind is where you sit there and it comes back to you immediately. And the second kind is where you set it and then you don't have to sit with it. You just let it go off, and then you can come back in 15 minutes, or when you're done with your next task, and review it.

[00:04:44] William Zeng: That's two. So we're definitely focused on getting the user to see something very quickly. Let's say it's urgent and you need to know how to solve the problem, right? Yes. But on the other hand, sometimes when the model doesn't get it right on the first try, we need to build in methods for the model to [00:05:00] recover and fix its own mistakes.

[00:05:02] William Zeng: So that's why we really focus on, like I mentioned, the model reviewing its own code and the model running GitHub Actions and these kinds of CI pipelines. So when it fails, it can see the error and then rewrite the code, just like a junior developer would. Again, a lot of it is looking at the error, writing the code again, and seeing if it passes.
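The loop William describes (write code, run the checks, read the error, write again) can be sketched in a few lines of Python. This is only an illustrative sketch, not Sweep's actual implementation: `repair_loop`, `fake_model`, and `fake_ci` are hypothetical names, the "model" is a stub that forgets a helper on its first try, and the "CI" simply executes the snippet.

```python
from typing import Callable, Optional, Tuple


def repair_loop(
    write_code: Callable[[Optional[str]], str],
    run_checks: Callable[[str], Tuple[bool, str]],
    max_attempts: int = 3,
) -> Tuple[Optional[str], int]:
    """Ask for code, run checks, and feed any error back until checks pass."""
    feedback: Optional[str] = None
    for attempt in range(1, max_attempts + 1):
        code = write_code(feedback)       # the "model" sees the last error, if any
        passed, error = run_checks(code)  # stand-in for a CI run
        if passed:
            return code, attempt
        feedback = error                  # the next attempt gets the failure message
    return None, max_attempts             # gave up; a human reviewer takes over


def fake_model(feedback: Optional[str]) -> str:
    """Hypothetical stub: forgets to define get_initials on the first try."""
    if feedback is None:
        return "print(get_initials('Ada Lovelace'))"
    return (
        "def get_initials(name):\n"
        "    return ''.join(part[0] for part in name.split())\n"
        "print(get_initials('Ada Lovelace'))"
    )


def fake_ci(code: str) -> Tuple[bool, str]:
    """Hypothetical stub: 'CI' here just executes the snippet and reports errors."""
    try:
        exec(code, {})
        return True, ""
    except Exception as exc:
        return False, f"{type(exc).__name__}: {exc}"


if __name__ == "__main__":
    final_code, attempts = repair_loop(fake_model, fake_ci)
    print(f"passed after {attempts} attempt(s)")
```

The cap on attempts matters: when the loop gives up, the result still lands in front of a human reviewer, which matches the point made earlier in the conversation about where the final responsibility sits.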

[00:05:23] Andrew Liew: Okay. So along that line, like you mentioned about how it reads the code, writes, and reviews, there are always cases where there could be potential bad actors going into this, like the Sweep software or any software base, and injecting some malicious code or autonomous code.

[00:05:42] Andrew Liew: How does Sweep ensure that its bot, or its assistant, is secured as much as possible? Yeah.

[00:05:51] William Zeng: So there's a couple of layers of security we can talk about. The first one is based on GitHub, because GitHub has [00:06:00] a really great... it's secure by default, like there are these personal access tokens.

[00:06:04] William Zeng: They have their installation-based authentication. So there are a lot of things Sweep can't do, even though there are things it can do. It cannot merge a PR. So by default, even if you had full permissions, you couldn't merge a PR and you couldn't deploy code. Typically people will gate their deployments on a merge.

[00:06:24] William Zeng: So if you can't merge, you can't deploy. That's the first part, on the code security side. But then, on the other hand, how do we store your code? We have two main components that we work with. The first is our infrastructure, which is SOC 2 certified.

[00:06:40] William Zeng: So they require our solution to have a baseline of security. The second part is how we index your code. And the way we do that is we actually do not store any code in plain text. What we do is store the embeddings, which are not plain text and are really difficult to reconstruct. It's very hard to turn embeddings back into text, right?

[00:06:59] William Zeng: Yes. [00:07:00] And what we do is key the embedding. The key is based on your file name and the GitHub SHA. Each SHA represents a point in time of a code base. So for example, let's just say you make one change every day. Today's code is probably different from yesterday's code.

[00:07:17] William Zeng: So what we'll do is we'll say "today, file A," and then we'll say "yesterday, file A." So then if you're looking at today's file A, we'll be able to give you what code is reflected based on the search, and no plaintext is stored anywhere except for when you make the request. And then what will happen is, for about five minutes while this request is happening, our server will spin up, it will pull the code from GitHub, write the issue, and then send it back.
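To make the "today, file A" versus "yesterday, file A" idea concrete, here is a minimal Python sketch of an index keyed on file name plus commit SHA that keeps only vectors, never the source text. Everything here is an assumption for illustration: `embed` is a toy hashed bag-of-words stand-in for a real embedding model, and `make_key` and `EmbeddingIndex` are hypothetical names, not Sweep's code.

```python
import hashlib
from typing import Dict, List, Optional


def embed(text: str, dims: int = 8) -> List[float]:
    """Toy stand-in for an embedding model: hashed bag of words."""
    vec = [0.0] * dims
    for word in text.split():
        digest = hashlib.sha256(word.encode("utf-8")).digest()
        vec[digest[0] % dims] += 1.0
    return vec


def make_key(file_path: str, commit_sha: str) -> str:
    """Derive the lookup key from the file name and the commit SHA."""
    return hashlib.sha256(f"{commit_sha}:{file_path}".encode("utf-8")).hexdigest()


class EmbeddingIndex:
    """Stores one vector per (file, commit); the source text is never kept."""

    def __init__(self) -> None:
        self._vectors: Dict[str, List[float]] = {}

    def add(self, file_path: str, commit_sha: str, source: str) -> None:
        # Only the embedding is stored; `source` is discarded after this call.
        self._vectors[make_key(file_path, commit_sha)] = embed(source)

    def get(self, file_path: str, commit_sha: str) -> Optional[List[float]]:
        return self._vectors.get(make_key(file_path, commit_sha))


if __name__ == "__main__":
    index = EmbeddingIndex()
    index.add("file_a.py", "sha-yesterday", "def f(): return 1")
    index.add("file_a.py", "sha-today", "def f(): return 2")
    print(index.get("file_a.py", "sha-today"))
```

Because the commit SHA is part of the key, the same file at two points in time gets two separate entries, so a lookup for today's commit never returns yesterday's vector.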

[00:07:48] William Zeng: And the other part of this is OpenAI, which states that they don't store your code for training, but only store it for debugging and misuse. So if there are any issues with that... I think overall, [00:08:00] that's the overview of how Sweep handles security.

[00:08:04] Andrew Liew: In other words, you guys have these guardrails in place. Like you say, it cannot merge, it cannot deploy, so it's not able to do this code overriding in that sense. And then the other part is the storing of the code, where, like you say, you have the baseline certification, which is SOC 2, and then your indexing is a form of encryption.

[00:08:21] Andrew Liew: Is there also some kind of audit trail in Sweep, so that if something happens, somebody can go back and check, "Hey, what has Sweep done, or what has somebody done with Sweep?" I don't know.

[00:08:33] William Zeng: Could you clarify that a little bit more? Do you mean what kind of changes Sweep has made, or what kind of...

[00:08:39] William Zeng: Yeah,

[00:08:39] Andrew Liew: Yeah, yeah. Think of it like this: even though there's a guardrail, so it cannot write the code or deploy or merge, it can read and give recommendations, right? So there could be a bad actor that somehow gets into this code base and uses Sweep to read it.

[00:08:56] Andrew Liew: Think of it as like a robber who enters, and then he [00:09:00] uses a robot to go and figure out where the safe is, you know, something like that.

[00:09:06] William Zeng: Oh, for sure. Yeah, that would be really bad. So far we're really strict about how we handle the privacy of our keys and all the permissions we pass.

[00:09:13] William Zeng: But for people who are really concerned about safety as well, since some people want the trade-off of hosting versus self-deployment, we do have an option for you to deploy Sweep yourself. Then the security would be your responsibility and only your responsibility, whereas of course, when you have a hosted solution, we handle that.

[00:09:33] Andrew Liew: Okay, we do have that. Okay, now let's go to the next interesting question. What is your view of artificial intelligence with reference to dev work and work productivity? What would the modern developer's work be like with tools like Sweep in the next five years?

[00:09:53] William Zeng: Yeah, I hope with AI working around your code base and working with you, it'll really free you up to make the complex decisions that AI [00:10:00] cannot make yet. Right now we haven't seen an AI that's really good at product creativity or very creative algorithmic problem solving.

[00:10:08] William Zeng: So mostly it's reproducing solutions it has seen, and the thing is that tech debt is mostly stuff that has been seen many times. So I think tech debt is a perfect place to start. And it fits with how I like to work. I like to work on solving hard problems and not necessarily adding stuff like documentation.

[00:10:25] William Zeng: So that's my view on it.

[00:10:26] Andrew Liew: Yeah, so you're saying that maybe in the next five years the developer's creativity will be released to focus on more complex algorithms, I don't know, like figuring out complex encryption problems, instead of, "Oh, this feature works, the if-then works," and just putting in documentation. That's just for the audience to understand the complexity of it.

[00:10:50] Andrew Liew: Developers can do that. Yeah, I think also, when it helps to reduce the tech debt, does it also increase the [00:11:00] velocity to release features faster? In other words, maybe the end-user product could have more usable forms or functions. What do you think?

[00:11:09] William Zeng: Yeah, no, definitely. I really like the idea of also increasing developer velocity. I think one comparison is how Excel changed how we work with data. It's just the same thing. If you were to write something simple, like a logging statement, it's the same thing. To me, I wouldn't want to solve problems that a calculator can solve.

[00:11:33] William Zeng: Yeah. If you think about it, it wouldn't feel good. Yeah. So then it's the same with tech debt in coding. I wouldn't want to write code that GPT-4 can write. That wouldn't be fun to me. Yeah. Yeah. So

[00:11:44] Andrew Liew: Enable the evolution of the development of products or software to be at a higher level, like thinking about all the interoperability or intercom algorithm problems, rather than, "This thing is the same old stuff again."