Python notes...

#How to run in Sublime Text: F7 or Ctrl+B

#In the web browser: F12 or Shift+Ctrl+C

If F12 does not work, right-click and choose "Inspect", or press Ctrl+Shift+I.

#https://www.crummy.com/software/BeautifulSoup/bs4/doc/

#https://www.youtube.com/watch?v=ZTJjW7XuHIY

#https://3log2.tistory.com/34?category=955630

Tab : indent

Shift+Tab : unindent

Single-line comment: #

Multi-line comment (three quotation marks): """ content """
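
A small sketch of the two comment styles above. The names `value` and `describe` are made up for this example. Strictly speaking, a triple-quoted block is a string literal that Python evaluates and throws away, which is why it can stand in for a multi-line comment.

```python
# A single-line comment starts with '#'.
value = 1  # a comment can also follow code on the same line

"""
A triple-quoted string that is not assigned to anything
is evaluated and discarded, so it works like a block comment.
"""


def describe():
    """A triple-quoted string placed first in a function becomes its docstring."""
    return value


print(describe())
print(describe.__doc__)
```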

#To find source code, search Google for "python <what you want to do>".

e.g. python write text file

https://wikidocs.net/26

Downloading Google images

Open a command prompt and run:

(Install) : pip install google_images_download

(Uninstall) : pip uninstall google_images_download

https://eneun.tistory.com/10

##################################################

for anchor in soup.find_all('a'):
    print(anchor.get('href', '/'))

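From the contents of soup,

find_all('a') : find every a tag

anchor : and assign each one to this variable in turn.

From anchor, get the href (= the address) and print it; '/' is printed when the tag has no href.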
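
To try the loop above without hitting a real site, here is a minimal sketch that parses a small HTML string instead. The HTML and the `links` list are made up for this example, and it assumes `beautifulsoup4` is installed.

```python
from bs4 import BeautifulSoup

# Hypothetical HTML used only for this illustration.
html = """
<html><body>
  <a href="https://example.com/a">first</a>
  <a href="/b">second</a>
  <a name="no-href">third</a>
</body></html>
"""

soup = BeautifulSoup(html, "html.parser")

# find_all('a') returns every <a> tag; get('href', '/') returns the href
# attribute, or the default '/' when the tag has no href at all.
links = [anchor.get("href", "/") for anchor in soup.find_all("a")]

for link in links:
    print(link)
```

The third tag has no href, so it falls back to the default value.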

##################################################

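In the web browser, press F12 and select the Elements tab at the top.

Press Shift+Ctrl+C to inspect the code of the element you want to crawl and work out the pattern it follows.

You need to find a regularity.

For example, are the items all wrapped in a <span> tag?

<span> is likely to be used elsewhere on the page too, so keep narrowing the search.

e.g. check that the class is "ah_k" on all of them

<span class="ah_k">keyword</span>

Then select the elements that are a <span> tag and (AND) have the class "ah_k".

Do not memorize the code; search Google for BeautifulSoup, open the official documentation, and look up the usage there.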

https://www.crummy.com/software/BeautifulSoup/bs4/doc/

b.select("title")

Selects the title tag.

b.select("body a")

Selects every a tag inside <body></body>.
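
A quick sketch of both selectors on a made-up page (the HTML and the variable names other than `b` are assumptions for this example). Note that select() always returns a list, even when only one tag matches.

```python
from bs4 import BeautifulSoup

# Hypothetical page used only for this illustration.
html = """
<html>
  <head><title>Sample page</title></head>
  <body>
    <p><a href="/one">one</a></p>
    <div><a href="/two">two</a></div>
  </body>
</html>
"""

b = BeautifulSoup(html, "html.parser")

# "title" matches by tag name; the result is a list with one element here.
title_text = b.select("title")[0].get_text()

# "body a" matches every <a> that is a descendant of <body>, at any depth.
body_hrefs = [a["href"] for a in b.select("body a")]

print(title_text)
print(body_hrefs)
```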

###########################################################

i = 1
for anchor in soup.select("span.ah_k"):
    print("Rank " + str(i) + " : " + anchor.get_text())
    i = i + 1

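soup.select("span.ah_k")

From the contents of soup, select only the span tags whose class is ah_k.

print(anchor.get_text())

Extract only the text from anchor and print it.

str(i)

i is a number, so convert it to a string before concatenating it with other strings.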
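
The same ranking loop can also be written with enumerate, which does the counting, and an f-string, which does the number-to-string conversion. This is only a sketch: the HTML is made up, and `lines` and `rank` are names chosen for this example.

```python
from bs4 import BeautifulSoup

# Hypothetical markup imitating a keyword ranking list.
html = """
<ul>
  <li><span class="ah_k">python</span></li>
  <li><span class="ah_k">crawler</span></li>
  <li><span class="other">ignored</span></li>
</ul>
"""

soup = BeautifulSoup(html, "html.parser")

lines = []
# enumerate(..., start=1) replaces the manual i = 1 / i = i + 1 counter.
for rank, anchor in enumerate(soup.select("span.ah_k"), start=1):
    # The f-string formats the integer, so no explicit str() call is needed.
    lines.append(f"Rank {rank}: {anchor.get_text()}")

for line in lines:
    print(line)
```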

Writing a file

f = open("C:/doit/새파일.txt", 'w')
for i in range(1, 11):
    data = "This is line %d.\n" % i
    f.write(data)
f.close()
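
A variation on the file-writing code using a with block, so the file is closed automatically even if an error happens partway through. To keep the sketch runnable on any machine, a temporary directory stands in for C:/doit, and the file name `newfile.txt` is made up.

```python
import os
import tempfile

# Hypothetical location: a temporary directory instead of C:/doit.
path = os.path.join(tempfile.mkdtemp(), "newfile.txt")

# 'w' creates the file, or overwrites it if it already exists.
with open(path, "w", encoding="utf-8") as f:
    for i in range(1, 11):
        f.write("This is line %d.\n" % i)

# Read the file back to check what was written.
with open(path, encoding="utf-8") as f:
    written = f.read().splitlines()

print(len(written), "lines written to", path)
```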

Tottenham

Liverpool

How can I crawl something of this type? I have searched everywhere and still cannot figure it out.

To crawl both of them, select them through their parent tag!

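<div id="hello">
<span>Tottenham</span>
<span>Liverpool</span>
</div>

for anchor in soup.select("div#hello span"):

For example, with markup like this, the selector picks the descendants of the div tag whose ID is hello, so both span tags are selected.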
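
A runnable sketch of the parent-tag approach. The HTML is reconstructed from the description (a div with the ID hello holding two span tags), and the extra span outside the div is added here only to show that it is not picked up.

```python
from bs4 import BeautifulSoup

# Hypothetical markup; the last <span> sits outside the div on purpose.
html = """
<div id="hello">
  <span>Tottenham</span>
  <span>Liverpool</span>
</div>
<span>outside</span>
"""

soup = BeautifulSoup(html, "html.parser")

# "div#hello span" matches only <span> tags that are descendants of the
# <div> whose id attribute is "hello".
teams = [anchor.get_text() for anchor in soup.select("div#hello span")]

print(teams)
```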

---

Instead of matching the number, a prefix match on the class should work, that is, select('span[class^="team"]').
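
For classes that differ only by a trailing number, one option is the CSS attribute selector `[class^="..."]`, which matches when the attribute value starts with the given text. This is a sketch under an assumption: the class names `team1` and `team2` are guesses, since the original markup from the question did not survive.

```python
from bs4 import BeautifulSoup

# Hypothetical markup; the class names team1/team2 are assumed.
html = """
<span class="team1">Tottenham</span>
<span class="team2">Liverpool</span>
<span class="score">2</span>
"""

soup = BeautifulSoup(html, "html.parser")

# [class^="team"] keeps tags whose class attribute begins with "team".
teams = [tag.get_text() for tag in soup.select('span[class^="team"]')]

print(teams)
```

One caveat: the match is against the whole class attribute, so a tag written as class="big team1" would not be picked up by this selector.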

"""

google_images_download 2.8.0

pip install google_images_download

https://pypi.org/project/google_images_download/

https://google-images-download.readthedocs.io/en/latest/arguments.html

"""

from google_images_download import google_images_download #importing the library

response = google_images_download.googleimagesdownload() #class instantiation

arguments = {"keywords":"Polar bears,balloons,Beaches","limit":20,"print_urls":True} #creating dictionary of arguments

paths = response.download(arguments) #passing the arguments to the function

print(paths) #printing absolute paths of the downloaded images

This method no longer works because of a Google update.

https://tiktikeuro.tistory.com/174
